Management Summary


DTIC  has  undertaken  a  three-phase  effort  to  improve  the  quality  of  its  subject  indexing. 
This  report  describes  baselining  of  the  present  indexing,  the  first  phase  of  the  effort.  Later 
phases  are  to  include  development  of  an  indexing  philosophy  and  identification  of  access 
methods  which  can  be  integrated  with  the  Electronic  Document  System  (EDS),  which  is 
presently under development.

In  order  to  address  the  issue  of  indexing  usability,  the  methodology  selected  was  that  of  the 
recall/precision  study.  A  sample  of  19  queries  was  selected  from  the  query  files  at  DTIC. 
Queries  were  searched  in  both  the  Technical  Report  (TR)  Bibliographic  Database  and  Work 
Unit  Information  System  (WUIS);  retrieval  was  limited  to  unclassified  citations. 

The  original  design  called  for  queries  to  be  searched  by  two  groups  of  expert  searchers 
(searchers  with  extensive  experience  in  searching  the  DROLS  system):  DTIC  searchers  who 
were  instructed  to  search  broadly  with  the  goal  of  retrieving  as  many  relevant  documents  as 
possible  (data  acquisition  searchers);  and  searchers  at  user  installations  who  were  instructed 
to  search  as  they  would  for  a  real  user,  insofar  as  possible  (test  searchers). 

This  design  assumed  that  there  would  be  significant  differences  in  search  strategies  between 
broad,  inclusive  searching,  and  searching  carried  out  under  real-life  constraints.  Had  this 
been  the  case,  overlap  of  test  searches  with  data  acquisition  searches  would  have  been  very 
high,  but  test  searches  would  have  retrieved  fewer  citations  than  data  acquisition  searches. 
It  was  intended  that  recall  and  precision  would  be  calculated  for  test  searchers  only,  because 
they  were  the  ones  to  be  searching  under  more  "realistic"  conditions. 

In  actual  fact,  neither  the  search  strategies  nor  the  overall  characteristics  of  their  retrieval 
differed  significantly  between  the  two  groups  of  searchers.  Data  acquisition  and  test 
searchers  retrieved  similar  numbers  of  hits  for  each  query,  and  examination  of  their  search 
strategies  showed  no  clear  differences  between  the  two  groups. 

Therefore  the  retrieval  of  the  two  groups  of  searchers  was  amalgamated  for  analysis,  and 
recall  and  precision  were  calculated  for  all  searchers.  The  wide  variation  in  search  strategies 
described  below  means  that  the  retrieval  of  all  searchers  combined  is  reasonably  broad. 
Analyzing  all  searchers’  results  for  recall  and  precision,  rather  than  just  those  of  the  test 
searchers,  provides  a  broader  set  of  findings  for  interpretation,  and  should  compensate  for 
any  effect  of  the  loss  of  the  distinction  between  the  two  search  phases. 

The  variation  in  search  strategies  was  great,  not  so  much  in  actual  choice  of  concepts  to  be 
searched  as  in  refinements  such  as  use  of  truncation,  inclusion  of  IAC  terms,  and  searching 
of  words  in  titles  or  abstracts.  Some  of  these  variations  are  attributable  to  design  of  the 
indexing  system,  while  others  reflect  features  of  the  search  system. 

Hits  were  downloaded  —  with  some  difficulty  —  and  processed  into  a  database  to  permit 
identification  of  the  fields  in  the  downloads.  Reports  containing  subject  and  narrative  fields 
were generated for judging of the relevance of the citations. If over 100 unique citations
were  retrieved  for  a  query,  a  random  sample  of  100  citations  was  selected  for  judging. 

Relevance  judges  were  subject  experts  located  at  five  of  the  DTIC-supported  Information 
Analysis Centers (IACs). Each set of hits was evaluated by two experts located at the same
IAC. Because concurrence between the judges was very low in numerous cases, recall and
precision  values  were  calculated  both  as  if  a  relevant  document  were  one  judged  as  relevant 
by  both,  and  as  if  it  were  one  judged  relevant  by  either  judge. 
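
The two relevance criteria can be illustrated with a minimal sketch. The function names and accession numbers below are hypothetical and are not taken from the study; the agreement measure shown is a simple proportion of matching verdicts, assumed here for illustration only.

```python
def relevant_documents(judge_a, judge_b, criterion="both"):
    """Combine two judges' sets of relevant records under one criterion."""
    a, b = set(judge_a), set(judge_b)
    if criterion == "both":
        return a & b  # strict: both judges marked the record relevant
    if criterion == "either":
        return a | b  # lenient: at least one judge marked it relevant
    raise ValueError("criterion must be 'both' or 'either'")


def agreement_rate(judge_a, judge_b, judged):
    """Share of judged records on which the two judges gave the same verdict."""
    a, b, judged = set(judge_a), set(judge_b), set(judged)
    if not judged:
        raise ValueError("no judged records")
    same = sum((doc in a) == (doc in b) for doc in judged)
    return same / len(judged)


# Hypothetical accession numbers, for illustration only.
judged = {"AD-1", "AD-2", "AD-3", "AD-4"}
judge_a = {"AD-1", "AD-2"}
judge_b = {"AD-2", "AD-3"}
print(sorted(relevant_documents(judge_a, judge_b, "both")))
print(sorted(relevant_documents(judge_a, judge_b, "either")))
print(agreement_rate(judge_a, judge_b, judged))
```

When agreement is low, the "both" set is much smaller than the "either" set, which is why the two sets of recall and precision values can differ widely.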

By  either  criterion  of  relevance,  the  recall/precision  values  did  not  show  the  conventional 
inverse  relationship  between  precision  and  recall.  Overall,  mean  recall  for  each  query  was 
low  to  moderate,  with  great  variation  in  precision  from  very  low  to  very  high.  The  relatively 
low  mean  recall  is  particularly  striking  when  it  is  considered  that  the  method  of  locating 
relevant  documents  was  necessarily  quite  limited.  The  documents  retrieved  by  the  searches 
in  the  study  formed  the  base  from  which  relevant  documents  were  selected;  relevant 
documents  in  the  database  which  were  not  retrieved  could  not  enter  into  the  evaluation. 

Even  though  specifics  cannot  be  determined  at  this  time,  it  is  reasonable  to  conclude  that 
improvements  to  both  indexing  and  the  search  engine  are  warranted.  The  difficulties 
encountered  in  downloading  indicate  that  there  are  problems  somewhere  in  the 
telecommunications  system  as  well. 

Phase  Two  will  provide  more  definitive  answers  as  to  the  causes  of  the  retrieval  inadequacies 
encountered,  at  least  for  the  documents  which  were  actually  retrieved  by  at  least  one 
searcher.  Those  which  remain  unknown,  of  course,  cannot  be  evaluated.  In  Phase  Two,  a 
sample  of  retrieved  documents  will  be  evaluated  for  each  query.  The  reasons  for  failure  of 
a  given  search  to  retrieve  a  relevant  document,  or  for  its  retrieval  of  a  nonrelevant 
document,  will  be  recorded  and  categorized.  It  is  anticipated  that  there  will  be  two  broad 
categories:  indexing  failures  and  search  failures,  with  several  subcategories  such  as 
vocabulary  inadequacy  (indexing)  or  failure  to  truncate  (search). 

A  philosophy  of  subject  indexing  for  DTIC  will  also  be  developed  in  Phase  Two.  This  effort 
will  include  extensive  consultations  with  DTIC  units  such  as  subject  analysis  as  well  as  DTIC 
users. 

Introduction


Background 

When  asked  how  well  DTIC  provides  intellectual  control  over  the  subject  content  of  its 
databases,  nearly  every  end  user,  intermediary,  indexer,  or  DTIC  manager  can  offer  an 
opinion  of  some  sort  on  the  quality  of  DTIC’s  indexing.  These  responses,  however,  typically 
have  only  two  characteristics  in  common: 

•  High  subjectivity,  based  on  impressionistic  data  rather  than  on  hard  evidence. 

•  A  consensus  that,  if  it  is  to  function  in  the  future  as  an  effective  information  provider, 
DTIC  must  find  ways  to  modernize  its  control  over  the  subject  content  of  its 
databases. 

Based  on  this  consensus,  DTIC  has  undertaken  to  improve  the  quality  of  its  indexing.  The 
effort  is  seen  as  involving  three  phases: 

•  Determination  of  the  baseline  quality  of  the  present  indexing. 

•  Development  of  an  indexing  philosophy. 

•  Identification  of  access  methods  which  can  be  successfully  integrated  with  the 
Electronic  Document  System  (EDS),  which  is  presently  under  development. 

This  report  describes  the  first  phase  of  this  effort,  baselining  of  the  present  indexing. 


Present  Indexing  at  DTIC 

DTIC’s  databases  are  processed  through  a  machine-aided  indexing  (MAI)  program  which 
selects  candidate  index  terms  from  the  DTIC  Thesaurus,  based  on  matching  of  words  and 
phrases  in  titles  and  abstracts  or  full  text.  In  the  Technical  Report  (TR)  Bibliographic 
Database,  titles  and  abstracts  are  processed.  Human  editors  review  the  candidate  terms,  and 
may  add  or  delete  terms  to  reflect  the  actual  content  of  the  document.  Documents  included 
in  the  Work  Unit  Information  System  (WUIS)  are  shorter,  and  the  full-text  record  is  entered 
into  the  system.  Indexing  for  these  records  is  mounted  on  the  database  without  human 
review;  i.e.,  it  is  automatic.  In  addition  to  being  run  through  the  MAI  system,  the  words  in 
the  text  of  the  records  are  also  inverted.  The  Independent  Research  &  Development 
(IR&D)  Database  is  also  indexed  with  MAI  and  no  human  review. 

The  system  for  selection  of  index  terms  has  been  well-reported  in  the  literature  (Klingbiel, 
1973;  1985;  Klingbiel  &  Rinker,  1976;  Jacobs,  1990).  This  system  was  developed  in  the 
1970’s,  and  was  updated  to  some  extent  in  the  early  1980’s.  In  its  current  mainframe 
version, it uses both phrase structure and transformational grammars to extract natural
language  phrases.  These  are  then  processed  against  a  dictionary  to  produce  descriptors. 
Processing  is  against  titles  and  abstracts  only  in  the  TR  Bibliographic  Database,  and  against 
full  text  in  WUIS  and  IR&D.  Since  the  original  system  was  developed,  the  state  of  the  art 
of  text  processing  has  advanced  significantly,  in  tandem  with  growth  in  computing  power  that 
permits  more  sophisticated  processing. 

In  September  1992,  a  PC-based  MAI  system  which  is  limited  to  word  matching  was 
implemented.  The  PC-based  system  represents  a  lower  level  of  functionality,  developed  to 
permit  indexing  on  PCs;  it  is  applied  only  to  the  TR  Bibliographic  Database. 

Some  records  in  TR  are  originated  and  indexed  by  IACs,  using  a  specialized  vocabulary  in 
addition  to  DTIC  Thesaurus  terms.  Searching  on  these  terms  requires  use  of  a  special 
mnemonic  because  the  terms  are  stored  in  a  separate  field  from  the  DTIC-assigned  terms. 
While  it  is  also  possible  to  limit  a  search  by  specific  IAC(s),  this  capability  was  not  relevant 
to  this  study,  which  was  limited  to  subject  indexing. 

Other  TR  records  are  originated  by  members  of  DTIC’s  Shared  Bibliographic  Input  Network 
(SBIN),  using  the  same  indexing  policies  as  DTIC. 


Plans  for  Indexing  Development  at  DTIC 

As  indicated  above,  the  purpose  of  the  present  study  is  to  determine  the  baseline  quality  of 
the  present  DTIC  indexing  system.  It  is  important  to  be  aware  of  the  problems  which  are 
present  in  order  to  make  effective  recommendations  for  improvement.  Improvements  will 
affect  SBIN  records  as  well. 

Work  is  underway  at  DTIC  on  a  system  for  electronic  storage  of  documents,  of  which  subject 
retrieval  will  be  an  important  component.  However,  this  system  will  not  be  fully  operational 
much  before  the  end  of  the  decade,  and  it  is  desired  to  make  improvements  in  the  indexing 
system  much  sooner  than  that  —  and  to  make  these  improvements  in  a  way  that  permits  as 
much  as  possible  of  the  effort  to  be  carried  over  into  the  new  system. 


Methodology 


Recall/Precision 

The  most  important  question  to  be  addressed  was  the  usability  of  the  subject  indexing  DTIC 
produces, not its "quality" as an abstract concept. It was therefore important to select a
methodology  that  would  permit  determination  of  how  well  the  indexing  aids  the  information 
retrieval  process.  The  most-tested  methodology  that  can  meet  this  criterion  reasonably  well 
is  the  recall-precision  study. 
This method has the longest history, so its strengths and pitfalls are the best understood.
Such a study attempts to quantify two characteristics of retrieval: the
proportion of the relevant records in a database which were actually retrieved by a search,
and the proportion of the records retrieved which are actually relevant to the information
need. A "relevant" record is one which bears on the query. The required closeness of match
between record and query is usually a subjective judgment, and a test may allow for degrees of
relevance.

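Recall is defined as:

\[ \text{Recall} = \frac{R_{rel}}{R_{tot}} \times 100 \]

Precision is defined as:

\[ \text{Precision} = \frac{R_{rel}}{T_{rel}} \times 100 \]

where \(R_{rel}\) is the number of relevant records retrieved in answer to a query, \(R_{tot}\) is the total
number of documents in the system that are relevant to that query, and \(T_{rel}\) is the total
number of records retrieved by the query.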
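
As a worked illustration of the two measures, the following minimal sketch computes both as percentages. The function names and the counts in the example are hypothetical and are not results from the study.

```python
def recall(relevant_retrieved, total_relevant):
    """Percentage of all relevant records that the search retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return 100.0 * relevant_retrieved / total_relevant


def precision(relevant_retrieved, total_retrieved):
    """Percentage of the retrieved records that are relevant."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    return 100.0 * relevant_retrieved / total_retrieved


# Hypothetical counts: 12 relevant records retrieved, 40 relevant records
# known in total, 30 records retrieved by the search.
print(recall(12, 40))     # 30.0
print(precision(12, 30))  # 40.0
```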

These  measures  have  some  limitations.  First  and  foremost,  in  a  practical  situation,  it  is 
impossible  to  judge  every  document  in  a  database  for  relevance  to  every  query  in  a  test. 
Instead,  approaches  which  seek  to  retrieve  as  many  relevant  records  as  possible  are  devised. 
These  include  searching  the  same  query  in  multiple  databases  —  if  the  document  records  are 
included  in  more  than  one  database  —  or  searching  the  queries  very  broadly  or  with  multiple 
strategies,  concentrating  on  retrieval  of  as  many  relevant  records  as  possible,  with  little 
concern  for  the  number  of  irrelevant  records  retrieved.  Records  retrieved  by  these  devices 
are  evaluated  for  relevance,  and  become  the  basis  against  which  test  searches  are  judged. 

"Relevance"  itself  can  be  interpreted  in  several  ways,  but  the  one  most  appropriate  here 
is  topical  relevance:  a  relevant  record  is  one  which  bears  on  the  query.  Froelich  (1994),  in 
his  paper  introducing  a  special  issue  on  relevance  of  the  Journal  of  the  American  Society  for 
Information  Science,  makes  the  point  that  topical  relevance  is  the  basis  for  other  forms  of 
relevance,  such  as  user  relevance  (i.e.,  relevance  to  the  user’s  information  need  at  the 
moment  of  accessing  the  information).  User  relevance  changes  as  more  information  is 
accessed.  For  example,  if  two  documents  happen  to  contain  essentially  the  same 
information,  both  are  topically  relevant.  However,  only  the  first  document  seen  may  be 
relevant  to  the  user;  the  second  will  be  repeating  information  which  the  user  already  knows. 
This  makes  user  relevance  order-dependent,  and  would  require  that  the  information  system 
have  in-depth  awareness  of  the  user’s  knowledge  state  and  its  changes  from  moment  to 
moment. 
Since information retrieval systems are only partially successful today at producing high
topical  relevance,  the  first  priority  should  be  fulfillment  of  this  basic  criterion. 

An  exhaustive  research  project,  supported  by  the  National  Science  Foundation,  to  study 
information  seeking  and  retrieving,  was  reported  by  Saracevic  (1987,  1988).  The  retrieval 
effectiveness  aspects  of  the  Saracevic  study  provided  the  framework  for  the  present  study. 
Topics  such  as  question  typologies  and  searchers’  cognitive  traits  which  were  covered  in 
Saracevic’s  work  are  not  germane  to  the  analysis  of  indexing  quality  and  were  not  included. 

In  the  present  study,  real  queries  were  searched  exhaustively  by  expert  searchers;  then  the 
same  queries  were  searched  in  a  more  typical  search  situation.  All  of  the  retrieved  citations 
were  amalgamated  and  subject  experts  judged  the  topical  relevance  of  either  a  sample  or 
the  entire  set,  depending  on  the  size  of  the  retrieved  set.  Given  the  impossibility  of 
evaluating  every  document  in  a  large  collection  for  relevance,  this  strategy  cannot  be 
assumed  to  locate  every  relevant  document  in  the  system.  However,  it  is  the  approach  which 
has  been  most  successful  in  previous  research,  when  it  was  necessary  to  limit  searching  to 
a  single  database.  (If  it  is  possible  to  search  for  the  same  documents  in  multiple  databases, 
a  greater  exhaustivity  of  retrieval  can  be  attained.) 

For this study, \(R_{tot}\), or total relevant records, was taken to be all the relevant records
retrieved  by  all  of  the  searches. 
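
The pooling approach can be sketched as follows, assuming each search is represented as a set of accession numbers keyed by searcher code. The function name, data layout, and example values are hypothetical.

```python
def pooled_recall(retrieved_by_search, judged_relevant):
    """Recall of each search against the pool of relevant records
    retrieved by all searches combined.

    retrieved_by_search: dict mapping searcher code -> set of accession numbers
    judged_relevant: set of accession numbers judged relevant
    """
    relevant = set(judged_relevant)
    pool = set()
    for hits in retrieved_by_search.values():
        pool |= set(hits) & relevant  # relevant records found by any search
    if not pool:
        return {code: None for code in retrieved_by_search}
    return {
        code: 100.0 * len(set(hits) & pool) / len(pool)
        for code, hits in retrieved_by_search.items()
    }


# Hypothetical searches and judgments, for illustration only.
searches = {
    "A": {"AD-1", "AD-2", "AD-3"},
    "B": {"AD-3", "AD-4"},
    "1": {"AD-5"},
}
relevant = {"AD-2", "AD-3", "AD-4"}
for code, value in pooled_recall(searches, relevant).items():
    print(code, round(value, 1))
```

Because the pool contains only records that at least one search retrieved, relevant records missed by every search are invisible, and recall computed this way is an upper bound on true recall.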


Scope  and  Limitations 

The  study  was  limited  to  unclassified  documents,  because  it  would  not  be  feasible  to  attempt 
to  judge  relevance  from  sanitized  data.  It  was  determined  that  there  is  no  difference  in 
indexing  policies  between  classified  and  unclassified  documents.  Both  the  TR  Bibliographic 
Database  and  WUIS  were  included,  but  the  IR&D  Database  was  excluded  to  avoid 
problems  of  access  to  proprietary  information. 

In  order  to  assure  that  real  queries  were  the  subject  of  the  test,  they  were  drawn  from  the 
files  of  search  queries  at  DTIC.  A  limitation  of  this  procedure  as  compared  with  an  actual 
search  situation  is  that  it  is  not  possible  for  the  searcher  to  interact  with  the  user  to  refine 
the query or determine if the retrieval is on target. While this limitation led to some
concern  on  the  part  of  the  searchers,  it  also  made  it  possible  to  focus  on  quality  of  indexing, 
rather  than  on  searchers’  ability  to  compensate  for  lack  of  quality. 

It  also  was  necessary  to  limit  the  topics  of  the  queries  to  those  for  which  expert  judges  were 
available  to  judge  the  relevance  of  the  results.  DTIC  funds  and  manages  a  number  of  the 
DoD  Information  Analysis  Centers  (IACs),  and  it  was  determined  that  the  best  resource  for 
relevance  judging  would  be  the  staff  of  these  IACs.  Only  queries  which  could  be  judged  by 
experts  at  the  IACs  were  selected  for  searching.  A  subset  of  the  IACs  at  which  particularly 
expert  searchers  were  available  was  selected  with  the  assistance  of  Dr.  Forrest  Frank  of 
DTIC. 

While 100-record samples of the citations retrieved by large searches were evaluated for
relevance,  rather  than  the  entire  set  of  the  retrieval,  appropriate  statistical  treatments  were 
applied  to  show  the  confidence  with  which  the  results  can  be  interpreted  as  applying  to  the 
entire set of retrieved citations. These are the confidence intervals shown in the various
tables  in  this  report.  While  the  samples  themselves  ranged  from  15  percent  to  100  percent 
of  the  hits  for  a  query,  absolute  sample  size  and  variance  in  the  results  are  more  important 
to  statistical  inference  than  the  percentage  of  the  universe  covered  by  a  sample,  so  long  as 
it  is  randomly  selected. 
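
The sampling rule and the role of sample size can be sketched as follows. The report does not state which interval formula was used, so the normal-approximation interval below is an assumption chosen for illustration; the function names and example counts are likewise hypothetical.

```python
import math
import random


def sample_for_judging(accession_numbers, limit=100, seed=None):
    """Return every unique citation, or a random sample of `limit`
    citations when more than `limit` unique citations were retrieved."""
    unique = sorted(set(accession_numbers))
    if len(unique) <= limit:
        return unique
    return sorted(random.Random(seed).sample(unique, limit))


def proportion_interval(relevant_in_sample, sample_size, z=1.96):
    """Approximate 95% confidence interval for the proportion relevant.

    Assumption: normal approximation to the binomial, not necessarily
    the treatment applied in the study.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = relevant_in_sample / sample_size
    half_width = z * math.sqrt(p * (1 - p) / sample_size)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical example: 40 of 100 sampled citations judged relevant.
low, high = proportion_interval(40, 100)
print(round(low, 3), round(high, 3))
```

The width of the interval in this sketch depends on the sample size and the observed proportion, not on the fraction of the retrieved set that was sampled, which is the point made in the paragraph above.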

Procedure 

The  following  steps  were  followed  in  gathering  and  analyzing  the  data: 

1.  Selection  from  the  DTIC  query  files  of  a  set  of  queries  to  be  searched. 

2.  Searching,  divided  into  data  acquisition  and  test  phases. 

3.  Relevance  judging. 

4.  Statistical  analysis  and  inference. 

Appendix  4  shows  one  search  strategy,  the  number  of  hits,  and  calculation  of  precision  and 
recall  statistics  for  that  search. 


Query  Selection 

In  August  1993,  the  DTIC  query  logs  (queries  phoned  in  by  users  for  search  by  DTIC  staff) 
were  reviewed  to  make  a  tentative  selection  of  queries  which  were  within  the  scope  of  the 
IACs  to  be  involved,  with  the  goal  of  having  15-25  queries  in  the  final  set.  Logs  for 
September  1992,  and  for  January,  April,  and  July  1993,  were  sampled  by  looking  at  every 
tenth  query  for  key  words  that  indicated  the  subject  of  the  search.  Since  the  query  log 
includes  only  a  brief  title,  many  searches  were  rejected  at  this  point  for  lack  of  information. 
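
The log sampling step amounts to a systematic sample followed by a keyword screen. A minimal sketch follows; the log entries, keyword list, and function names are hypothetical, since the actual log format and scope terms are not given here.

```python
def every_nth(log_entries, step=10, start=0):
    """Systematic sample: every `step`-th entry, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(log_entries[start::step])


def in_scope(title, scope_keywords):
    """True if the brief query title mentions any keyword for an IAC's scope."""
    lowered = title.lower()
    return any(keyword.lower() in lowered for keyword in scope_keywords)


# Hypothetical query log of 35 brief titles.
log = [f"query {i}" for i in range(1, 36)]
print(every_nth(log))  # ['query 1', 'query 11', 'query 21', 'query 31']

# Hypothetical scope keywords for one IAC.
print(in_scope("Corrosion of composite armor", ["composite", "ceramic"]))  # True
```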

The  following  criteria  were  established  for  queries: 

•  The  query  should  be  within  the  scope  of  one  of  the  IACs  selected  for  participation. 

•  A  balance  should  be  maintained,  with  a  goal  of  3-6  queries  per  IAC. 

•  The  query  should  appear  to  have  retrieved  at  least  one  relevant  document. 

The  queries  which  from  their  titles  seemed  to  be  of  possible  value  were  then  examined  in 
the  query  files.  Any  queries  which  were  not  actually  within  the  scope  of  the  IACs  being 
considered  for  participation,  which  did  not  appear  to  have  relevant  retrieval,  or  which  were 
too  vague,  were  rejected  at  this  point.  A  total  of  32  queries  survived  this  first  filtering. 
All information about the query was recorded, but not the search strategy. From this
information,  a  statement  of  the  query  was  formulated  which  contained  all  the  relevant  terms. 
The  strategy  was  omitted  to  avoid  influencing  either  the  search  strategies  developed  for  the 
study  or  the  way  in  which  the  query  was  formulated  for  the  searchers. 

The  32  queries  were  tentatively  organized  according  to  IAC,  and  were  then  reviewed  with 
Dr.  Forrest  Frank  for  suitability.  He  reassigned  some  of  the  queries  to  different  IACs,  and 
selected  19  which  were  most  appropriate  for  searching.  These  19  queries  became  the  query 
set  for  the  study.  A  list  of  the  queries,  and  of  the  IACs  to  which  they  were  assigned  for 
relevance  judging,  may  be  found  in  Appendix  1. 


Searching 


Information  gathered 

It  was  important  to  select  an  appropriate  set  of  fields  for  downloading.  The  study  concerned 
indexing  quality,  and  it  was  not  desirable  for  a  judge’s  determination  of  relevance  to  be 
influenced  by  such  factors  as  author’s  affiliation.  Still,  it  might  be  useful  to  be  able  to 
identify  the  source  of  a  document  at  a  later  date.  Therefore,  fields  that  would  contribute 
to  a  judgment  of  relevance  on  the  basis  of  subject,  plus  fields  identifying  the  corporate 
source and the date, were downloaded. Downloaded fields are as follows:

TR Database

1 - DTIC Accession Number
5 - Corporate Author
6 - Title
11 - Report Date
23 - Descriptors
25 - Identifiers
27 - Abstract
44 - IAC Subject Terms

WUIS Database

an - DTIC Accession Number
rd - Report Date
ti - Title
de - Descriptors
poa - Performing Organization Activity Name
ran - Responsible Organization Activity Name
kw - Key Words
app - Approach
obj - Objectives
prg - Progress

In the TR Bibliographic Database, Corporate Author and Report Date were not provided
to  the  relevance  judges;  in  WUIS,  the  Performing  Organization,  Responsible  Organization, 
and  Report  Date  were  not  provided.  This  information  was  omitted  in  order  to  prevent  any 
possible  influence  of  non-subject  data  on  the  relevance  judgments. 

Because  the  purpose  of  this  baseline  study  was  to  determine  the  present  quality  of  indexing, 
it  was  decided  to  accept  the  default  search  limitation  to  the  past  10  years,  rather  than 
searching  the  full  database.  After  a  number  of  searches  had  been  completed,  it  was  learned 
that  this  default  applies  only  to  the  TR  Bibliographic  Database  and  that  the  default  for  the 
WUIS  Database  is  no  time  limitation.  Therefore,  the  searches  of  the  two  databases  covered 
different  time  spans.  This  difference  was  not  seen  as  presenting  a  problem  because  the  two 
databases  were  to  be  analyzed  separately  in  any  case. 


Data  acquisition  searches 

These  searches  were  conducted  in-house  by  DTIC  staff,  who  were  instructed  to  develop 
broad  strategies  designed  to  retrieve  as  many  relevant  records  as  possible,  even  if  this  meant 
a  substantial  number  of  irrelevant  retrievals.  A  copy  of  the  introductory  materials  provided 
to  searchers  may  be  found  in  Appendix  2.  The  searchers  received  a  copy  of  "Introduction 
to  the  Study"  and  "Conducting  Searches:  Data  Acquisition  Phase."  A  total  of  six  searchers 
participated  in  this  phase. 

In  order  to  remove  the  influence  of  search  order  from  the  results,  the  order  of  the  queries 
was  randomized  for  each  searcher.  The  goal  was  to  have  each  query  searched  by  at  least 
two  searchers;  due  to  the  varying  success  rates  of  searching  and  the  effects  of  randomization, 
queries actually were searched by two to four searchers each. Assignment of queries
to  individual  searchers  was  random,  with  no  relation  to  the  subject  content  of  the  searches. 
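
Randomizing the query order independently for each searcher can be sketched as follows. The function name and the seed are hypothetical; the study does not describe the randomization procedure that was actually used.

```python
import random


def randomized_orders(queries, searchers, seed=None):
    """Give each searcher an independently shuffled copy of the query list."""
    rng = random.Random(seed)
    orders = {}
    for searcher in searchers:
        order = list(queries)
        rng.shuffle(order)  # in-place shuffle of this searcher's copy
        orders[searcher] = order
    return orders


# 19 queries and six data acquisition searchers coded A-F.
orders = randomized_orders(range(1, 20), "ABCDEF", seed=1994)
for searcher, order in orders.items():
    print(searcher, order[:5])
```

A searcher who runs out of time completes only the first part of a shuffled list, so coverage per query varies, which is consistent with queries being searched by differing numbers of searchers.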

TR  and  WUIS  were  searched  consecutively  for  each  query,  since  in  many  cases  the  same 
strategy  was  applicable  to  both.  The  searchers  input  and  revised  their  strategies,  and  when 
they  were  satisfied  with  the  retrieved  set,  downloaded  the  records  including  the  specified 
fields. 

Dialup  access  was  used  so  that  records  could  be  downloaded  to  floppy  disks  for  analysis. 
A  problem  with  loss  of  data  during  transmission  was  encountered  with  downloading.  When 
the  message  "data  lost  toward  terminal"  was  encountered,  a  gap  that  might  include  several 
records  was  invariably  discovered  somewhere  earlier  in  the  download.  This  happened  quite 
consistently  with  almost  every  search  that  retrieved  a  significant  number  of  hits.  The 
problem  was  solved  —  without  ever  tracing  the  actual  cause  —  when  it  was  noted  that 
searchers  who  made  a  habit  of  storing  their  results  in  a  user  file  and  then  downloading  from 
that  file,  instead  of  displaying  search  results  directly,  never  had  data  losses  during 
transmission.  The  user  file  step  was  included  in  all  future  searches,  and  downloading  was 
successful,  without  loss  of  data. 

In order to acquire the missing data, defective searches were rerun using the original
strategy.  After  verifying  that  the  strategy  was  unchanged,  the  new  retrieved  set  was 
substituted  for  the  defective  set.  Unfortunately,  the  Descriptor  field  was  omitted  from  some 
downloads  of  these  rerun  searches;  these  strategies  were  run  a  third  time,  downloading  only 
Accession  number  and  Descriptor  fields  in  order  to  save  time.  The  retrieved  records  were 
merged  with  the  earlier  records  so  that  the  record  contained  full  information. 
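
The merge of the descriptor-only download into the earlier records can be sketched as a join on accession number. The record layout and field names below are hypothetical and are not the format of the actual downloads.

```python
def merge_descriptors(records, descriptor_download):
    """Fill in missing descriptors from a later accession/descriptor download.

    records: dict mapping accession number -> dict of fields
    descriptor_download: dict mapping accession number -> list of descriptors
    """
    merged = {}
    for accession, fields in records.items():
        combined = dict(fields)  # copy so the input is left unchanged
        if not combined.get("descriptors") and accession in descriptor_download:
            combined["descriptors"] = list(descriptor_download[accession])
        merged[accession] = combined
    return merged


def missing_descriptors(records):
    """Accession numbers whose records still lack descriptors."""
    return sorted(a for a, f in records.items() if not f.get("descriptors"))


# Hypothetical records, for illustration only.
records = {
    "AD-1": {"title": "First report", "descriptors": ["RADAR"]},
    "AD-2": {"title": "Second report"},
}
rerun = {"AD-2": ["CERAMICS", "ARMOR"]}
merged = merge_descriptors(records, rerun)
print(merged["AD-2"]["descriptors"])
print(missing_descriptors(merged))
```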

Even  aside  from  telecommunication  problems,  downloading  was  found  to  be  a  very  slow 
process,  reducing  significantly  the  number  of  searches  that  could  be  executed  even  by 
searchers  who  used  two  terminals  simultaneously.  The  principal  investigator  was  present 
during  search  strategy  formulation  and  running  of  the  original  searches,  but  not  during  the 
reruns. 

The  use  of  dialup  access  also  caused  problems  in  search  modification;  searchers  who  were 
accustomed  to  being  able  to  modify  and  refine  searches  in  progress  on  a  dedicated  terminal 
had  to  rekey  the  search  from  the  beginning  on  the  dialup  system  in  order  to  make  even 
minor  changes. 


Test  Searches 

The  test  searches  were  carried  out  by  experienced  searchers  at  defense  installations.  Two 
searchers  searched  each  query.  As  with  the  data  acquisition  searches,  the  order  of  searching 
was  randomized,  and  no  two  searchers  searched  exactly  the  same  queries  in  the  same  order. 
They  were  given  a  copy  of  the  introduction  to  the  study,  and  the  instructions  for  the  test 
phase,  reproduced  in  Appendix  2. 

Originally  it  was  thought  that  seven  searchers  would  be  used  for  this  phase,  but  it  was  found 
that  the  results  from  six  were  adequate  to  assure  that  each  query  was  searched  twice; 
therefore  only  the  results  from  these  six  were  used. 

The  original  goal  of  the  test  searches  was  to  determine  the  quality  of  retrieval  under 
conditions  as  close  as  possible  to  those  prevailing  in  a  real  situation.  It  was  assumed  that 
this  would  mean  a  more  focused  search,  with  some  care  taken  to  minimize  irrelevant 
retrievals,  even  at  the  price  of  less  than  maximum  recall,  because  time  pressures  would  be 
greater  and  users  would  not  want  to  scan  an  immense  quantity  of  output.  In  reality,  the 
searchers  were  in  research  situations  where  the  concern  was  to  find  as  much  relevant 
material  as  possible.  Their  searches  were  as  broad  as  the  data  acquisition  searches. 
Furthermore,  the  strategies  were  different  enough  that  overlap  was  not  very  high;  that  is, 
they  tended  to  retrieve  different  citations. 
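
Overlap between two searches of the same query can be quantified in several ways. The sketch below uses the ratio of shared citations to all citations retrieved by either search; this measure is an assumption for illustration, since the overlap statistic used in the study is not defined in this section.

```python
def overlap(hits_a, hits_b):
    """Shared citations as a fraction of all citations retrieved by either search."""
    a, b = set(hits_a), set(hits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical retrieved sets: two shared citations out of six distinct ones.
search_a = {"AD-1", "AD-2", "AD-3", "AD-4"}
search_b = {"AD-3", "AD-4", "AD-5", "AD-6"}
print(round(overlap(search_a, search_b), 3))  # 0.333
```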

Searchers  were  asked  to  proceed  as  if  the  search  were  a  real  one  requested  by  one  of  their 
users.  The  only  difference  was  to  be  that  user  feedback  was  unavailable.  All  of  the 
searchers  were  somewhat  uncomfortable  with  this  limitation,  because  they  correctly  regarded 
user  feedback  as  fundamental  to  the  search  process.  With  the  explanation  that  this 
limitation  was  unavoidable  in  the  study,  most  were  able  to  proceed  without  further 
discussion. When searchers posed the problem of user differences, particularly in willingness
to  accept  a  large  retrieved  set,  they  were  asked  to  assume  a  user  preference  for  moderate 
retrieval,  i.e.,  somewhere  between  "comprehensive"  and  "a  few  good  citations." 

The  principal  investigator  was  present  while  searches  were  conducted  by  four  of  the 
searchers.  DROLS  was  down  when  the  site  visit  was  made  to  work  with  a  fifth  searcher;  he 
prepared  the  strategies  at  that  time  and  ran  them  later.  The  sixth  searcher  found  it  possible 
to  carry  out  only  a  few  searches,  as  time  permitted,  so  that  it  was  not  possible  to  be  present 
while  he  was  working. 

Most  users  did  not  have  downloading  capability.  Downloading  as  searches  were  performed 
was  possible  in  only  two  cases.  Even  these  installations  had  difficulty  because  they  rarely 
or  never  used  downloading.  Other  available  installations  had  no  downloading  facilities  at 
all. 

The  procedure  was  modified  to  permit  gathering  the  crucial  input  —  their  strategies  —  from 
the  other  test  searchers.  Dedicated  or  dialup  access  was  used,  whichever  was  most 
convenient  for  a  particular  facility.  The  searcher  developed  and  modified  the  strategy  until 
s/he  was  satisfied  that  the  retrieval  was  what  was  desired,  but  did  not  print  or  download  the 
full  list  of  hits.  The  strategy  was  recorded,  and  was  executed  by  a  trained  searcher  at  DTIC, 
using  dialup  access  and  downloading  the  hits.  The  principal  investigator  verified  that  the 
strategy  that  was  actually  searched  was  identical  to  the  strategy  developed  by  the  test 
searcher. 

Search  strategies  for  each  query  are  listed  in  Appendix  7.  Data  acquisition  searchers  are 
coded  A-F;  test  searchers  are  coded  1-6. 


Data  processing 

The  raw  search  output  was  first  run  through  word  processing.  For  each  search,  the  strategy 
was separated from the retrieved citations. Then a listing of the accession numbers was
generated and counted, so that if the accession number count did not match the citation
count in the retrieved set, the reason could be determined. The reasons
included  notes  such  as  "unannounceable  category"  or  "document  not  available."  No 
subject  data  were  available  for  such  hits,  and  they  were  excluded  from  further  analysis. 
Similarly,  Referrals  were  excluded  from  the  analysis,  because  these  are  not  "documents"  and 
the  records  do  not  contain  a  significant  amount  of  substantive  information. 

System  messages,  line  noise,  etc.,  were  deleted  from  the  citation  file,  which  was  then  sent 
to  a  data  processing  contractor  who  read  the  data  into  a  database,  permitting  all  retrieved 
fields  to  be  identified.  The  records  for  all  searches  of  a  given  query  were  merged,  retaining 
an  indication  of  the  searcher(s)  who  had  retrieved  each  citation.  In  addition  to  all  of  the 
fields in the retrieved record, each record included the query number, the database, and the
searcher  who  retrieved  it.  Searcher  codes  were  alphabetic  for  data  acquisition  searchers  and 
numeric for test searchers, making the two groups easily distinguishable.

A deduplication routine, using the accession number as the key, was run for each query, and
the  duplicates  stored  in  a  separate  file.  The  file  without  duplicates  was  used  to  generate  the 
report  for  the  relevance  judges.  Tables  1  and  2  show  the  number  of  hits  available  for 
further  analysis  (i.e.,  excluding  records  for  unavailable  documents  and  referrals)  for  data 
acquisition  and  test  searches,  total  hits,  and  mean  (average)  number  of  hits  for  each  searcher 
as  well  as  the  total  unique  (deduped)  hits  for  each  query. 
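
The merging and deduplication steps described above can be sketched as follows. This is a minimal illustration only, not the contractor's actual routine: the function name, the record layout, and the accession numbers are hypothetical, chosen to show how keying on the accession number yields a unique file, a separate duplicates file, and the counts of the kind reported in Tables 1 and 2.

```python
from collections import OrderedDict


def merge_and_dedupe(hits):
    """Merge all searches of one query, keyed on accession number.

    hits: iterable of (accession_number, searcher_code) pairs.
    Returns (unique, duplicates): `unique` maps each accession number to
    the set of searchers who retrieved it; `duplicates` lists the repeated
    records that would be stored in a separate file.
    """
    unique = OrderedDict()
    duplicates = []
    for accession, searcher in hits:
        if accession in unique:
            # Already seen: keep the record out of the unique file,
            # but remember which searcher also retrieved it.
            duplicates.append((accession, searcher))
            unique[accession].add(searcher)
        else:
            unique[accession] = {searcher}
    return unique, duplicates


# Hypothetical hits for one query (accession numbers are invented).
hits = [
    ("AD-A000001", "A"),
    ("AD-A000002", "A"),
    ("AD-A000001", "1"),
    ("AD-A000003", "F"),
    ("AD-A000002", "F"),
]

unique, duplicates = merge_and_dedupe(hits)
total_hits = len(hits)                         # all hits, duplicates included
unique_hits = len(unique)                      # deduped hits for the query
searchers = {code for _, code in hits}
mean_hits = total_hits / len(searchers)        # mean hits per searcher
```

Because each unique record keeps the set of searcher codes, the alphabetic (data acquisition) and numeric (test) groups remain distinguishable after the merge.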


Query    No. of searchers      No. of hits       Total    Mean    Unique
 No.      D.A.     Test       D.A.     Test      hits     hits     hits

  1         3        2         471      152       623      125      445
  2         2        2           7       12        19        5       18
  3         2        2          89       84       173       43      154
  4         2        2         570      346       916      229      425
  5         4        2         127       11       138       23      130
  6         2        2         159      143       302       76      179
  7         2        2         101       70       171       43      112
  8         3        2         231      390       621      124      560
  9         2        2         264      251       515      129      369
 10         3        1         174        7       181       45      176
 11         3        2          11        9        20        4        9
 12         3        1         102       42       144       36      112
 13         3        2         296       78       374       75      186
 14         4        2         128        9       138       23      112
 15         3        2          26       11        37        7       26
 16         3        2          34       22        56       11       26
 17         3        2         257       78       336       67      265
 18         2        1         504       90       594      198      351
 19         3        2         537       87       624      125      350

Table 1.  Number of Hits by Query: TR Bibliographic Database


Query    No. of searchers      No. of hits       Total    Mean    Unique
 No.      D.A.     Test       D.A.     Test      hits     hits     hits

  1         3        2         749      266     1,015      203      682
  2         3        2          17      108       125       25      108
  3         2        2          28      242       270       68      207
  4         2        2         446      308       754      189      321
  5         4        2         127       27       154       26      126
  6         3        2         208      131       339       68      137
  7         4        2         178       63       241       40      122
  8         3        2         180      238       418       84      363
  9         3        2         186      283       469       94      287
 10         3        1         196       41       237       59      222
 11         3        2          10        6        16        3        6
 12         3        2          74       46       120       24       93
 13         3        2         260      204       464       93      225
 14         5        1          86       31       117       20      100
 15         3        2          37       65       102       20       60
 16         3        1          96       27       123       31       93
 17         3        2          67       96       163       33      114
 18         2        1         137       12       149       50      125
 19         3        2          80       57       137       27       78

Table 2.  Number of Hits by Query: WUIS

After the retrieval results were deduped, a random sample was taken of the hits for queries
which  retrieved  more  than  100  unique  citations.  One  hundred  citations  per  query  was 
considered  to  be  the  maximum  that  it  would  be  appropriate  to  expect  a  relevance  judge  to 
review  in  a  relatively  short  time  —  particularly  since  most  judges  would  be  reviewing  the 
results  of  four  or  five  queries  in  each  of  the  two  databases. 
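
The sampling rule can be sketched as below. The report says only that the sample was random, so simple random sampling without replacement is an assumption here, as are the function name, the fixed seed, and the invented document identifiers.

```python
import random

MAX_PER_QUERY = 100  # ceiling on citations sent to a judge for one query


def sample_for_judging(unique_accessions, rng, limit=MAX_PER_QUERY):
    """Return all unique hits if there are at most `limit`, else a random sample."""
    items = list(unique_accessions)
    if len(items) <= limit:
        return items                    # small sets go to the judges in full
    return rng.sample(items, limit)     # sampling without replacement


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical deduped hit lists: one below and one above the threshold.
small = [f"DOC{i:04d}" for i in range(18)]
large = [f"DOC{i:04d}" for i in range(445)]

small_sample = sample_for_judging(small, rng)
large_sample = sample_for_judging(large, rng)
```

Under this rule a query with 18 unique hits is judged in full, while a query with 445 unique hits is reduced to 100 citations.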


The  retrieval  pattern 

Most  searchers,  as  indicated  above,  searched  rather  broadly.  One  data  acquisition  searcher 
was  uncomfortable  with  this  practice,  and  refined  his  searches  extensively  in  an  attempt  to 
minimize  retrieval  of  irrelevant  records.  He  typically  retrieved  only  a  few  citations,  and  for 
a  number  of  queries  retrieved  none.  In  two  cases  a  test  searcher’s  strategy  for  a  query  was 
so  broad  that  it  produced  over  1000  hits,  and  no  downloading  was  attempted. 

Since  there  was  no  consistent  pattern  of  difference  between  the  data  acquisition  and  test 
searches,  the  distinction  between  them  was  abandoned  for  purposes  of  data  analysis,  even 
though  a  record  of  the  phase  in  which  a  search  was  performed  was  maintained. 


Relevance  Judging 

Searching  is  frequently  carried  out  by  generalists  for  specialists.  It  is  important  that  the 
relevance  judges  be  qualified  to  determine  the  relevance  of  a  particular  citation  to  a  query, 
and  a  reasonable  knowledge  of  the  subject  field  is  required  for  this.  For  this  reason,  subject- 
specialized  personnel  at  a  number  of  DTIC-sponsored  IACs  were  called  on  to  serve  as 
judges. As noted above, Dr. Forrest Frank provided guidance in IAC selection. Five IACs
(CSERIAC, GACIAC, IRIA, MTIAC, and SURVIAC) were involved, with two judges at
each IAC. The queries are listed by the IAC judging them in Appendix 1. The judges were
selected by the director of the relevant IAC.

Each  judge  was  sent  a  package  containing  a  letter  expressing  thanks  for  participating  in  the 
project  and  briefly  explaining  the  task,  plus  a  copy  of  the  "Introduction  to  the  Study"  and 
"Relevance  Judging,"  found  in  Appendix  2.  These  materials  were  on  top  of  a  set  of 
envelopes,  each  envelope  containing  a  cover  sheet  with  the  query  number  and  query 
statement,  plus  the  citations  to  be  judged.  Judges  were  given  2  weeks  to  complete  the  task; 
most  completed  it  in  a  week  to  10  days. 

The director of one IAC asked to have both sets of materials sent to him; a staff member
called  to  confirm  that  the  judges  were  expected  to  work  independently.  Another  judge 
called  with  a  request  for  more  information  on  a  particular  query.  When  this  individual  was 
assured  that  the  queries  were  real  ones  that  had  been  received  at  DTIC,  and  it  was 
explained that the principal investigator had no more knowledge of the query than had been
provided (and had to be careful not to interpret for fear of contaminating the results), this
seemed to resolve the problem.


Statistical analysis

The relevance judgments and codes identifying the judges were added to the citation records.
"Relevant" and "Partially relevant" judgments were combined for all analyses. Measures
of  concurrence  between  judges  were  calculated  (Appendix  5). 

Table  3  shows  the  lambda  values  representing  the  extent  of  concurrence  between  the  judges. 
Lambda  is  a  statistical  measure,  varying  from  0  to  1,  which  is  used  to  determine  the  extent 
to  which  a  dependent  variable  can  be  predicted  from  the  value  of  an  independent  variable. 
In  this  case,  the  variables  are  the  two  judgements,  and  the  lambda  value  shows  how  well  it 
is  possible  to  predict  the  judgment  of  one  judge  if  the  judgment  of  the  other  is  known. 

Overall,  the  concurrence  of  the  judges  was  not  high,  with  a  pooled  lambda  (for  all  queries) 
of  0.21  for  TR  and  0.29  for  WUIS. 

There  is  an  interesting  variation  by  pairs  of  judges  which  is  only  partly  evident  from  the 
lambda  values  above.  The  same  pair  of  judges  evaluated  queries  1-4.  These  judges’  level 
of  agreement  on  relevance  was  fairly  high,  but  one  judge  clearly  imposed  a  higher  standard 
for  relevance,  so  that  while  one  judge  considered  nonrelevant  citations  that  the  second  judge 
considered  relevant,  the  reverse  was  not  the  case  —  the  second  judge’s  nonrelevant 
documents  were  also  nonrelevant  for  the  first  judge. 

Agreement  between  the  pair  of  judges  who  evaluated  queries  5-9,  on  the  other  hand,  was 
quite  high,  with  a  number  of  instances  of  complete  agreement  (lambda  =1.00).  Agreement 
between  the  judges  who  evaluated  queries  10-14  was  moderate  to  low,  while  agreement 
between  those  who  evaluated  15-18  and  19  was  extremely  low,  with  many  lambdas  of  0.00. 

It  should  be  noted,  however,  that  a  lambda  of  0.00  does  not  imply  complete  disagreement 
between  judges.  For  example,  on  TR  query  18,  the  judges  agreed  on  61  of  83  documents 
or  73  percent,  but  the  lambda  was  0.00;  on  WU  query  18  they  agreed  on  78  of  96,  or  81 
percent, but the lambda was 0.10. This situation arises because lambda gives the
incremental accuracy of predicting one judge's opinion from the other judge's data, as
opposed to basing it on his/her own history, and a certain amount of agreement is to be
expected by chance.
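
How high raw agreement can coexist with a lambda of 0.00 is illustrated in the sketch below. It assumes the asymmetric Goodman-Kruskal form of lambda; the report does not say which variant was computed. The 2 x 2 counts are invented so that the judges agree on 61 of 83 documents, and are not the actual judgments for TR query 18.

```python
def goodman_kruskal_lambda(table):
    """Asymmetric lambda: predict the column judge from the row judge.

    table[i][j] is the number of documents to which the row judge gave
    judgment i and the column judge gave judgment j.
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Errors when always guessing the column judge's most frequent judgment.
    errors_without = n - max(col_totals)
    # Errors when guessing the most frequent judgment within each row.
    errors_with = n - sum(max(row) for row in table)
    if errors_without == 0:
        return 0.0  # nothing left to improve on
    return (errors_without - errors_with) / errors_without


# Hypothetical counts (rows: judge 1, columns: judge 2;
# index 0 = relevant, index 1 = nonrelevant).
table = [
    [58, 12],
    [10, 3],
]

total = sum(sum(row) for row in table)
agreement = (table[0][0] + table[1][1]) / total   # 61 of 83
lam = goodman_kruskal_lambda(table)

# A table with no off-diagonal entries gives complete agreement.
perfect = goodman_kruskal_lambda([[40, 0], [0, 10]])
```

In the hypothetical table, judge 2 calls 68 of the 83 documents relevant, so always guessing "relevant" is wrong 15 times. Knowing judge 1's judgment does not reduce that error count, so lambda is zero even though the two judges agree on about 73 percent of the documents.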


Query        TR       WU

  1         0.11     0.07
  2         0.22     0.97
  3         0.50     0.20
  4         0.15     0.01
  5         1.00     0.75
  6         1.00     1.00
  7         1.00     0.88
  8         1.00     0.94
  9         0.91     0.33
 10         0.50     0.33
 11         0.50     0.33
 12         0.28     0.17
 13         0.16     0.43
 14         0.36     0.00
 15         0.07     0.00
 16         0.07     0.00
 17         0.07     0.00
 18         0.00     0.10
 19         0.06     0.00

Pooled      0.21     0.29

Table 3.  Lambda values of relevance judge concurrence


Recall and Precision Measures


Recall  and  precision  ratios  were  calculated  for  each  searcher  and  each  query  in  both  the  TR 
Bibliographic  Database  and  WUIS  (Appendix  6),  and  amalgamated  ratios  were  then 
calculated  for  all  searchers  of  each  query  (Appendix  6  and  Tables  4  and  5). 

Since  the  measures  of  concurrence  were  rather  low  for  most  queries,  the  analyses  of 
precision  and  recall  were  carried  out  twice  using  different  criteria  for  determining  a  relevant 
document: 

•  one  which  both  judges  concurred  in  evaluating  as  relevant  or  partially  relevant 

•  one  which  either  judge  had  evaluated  as  relevant  or  partially  relevant 

The  first  of  these  methods  is  stricter  than  the  second.  Appendix  6  shows  precision  and  recall 
ratios  by  searcher  for  each  query.  Weighted  means  are  given  by  searcher  and  by  query,  with 
95  percent  confidence  limits.  As  could  be  expected,  precision  is  higher  —  sometimes  much 
higher  —  using  the  more  generous  criterion.  Recall,  on  the  other  hand,  tends  to  be  lower. 
Tables  4  and  5  summarize  the  precision  and  recall  ratios  for  TR  and  WUIS,  respectively. 
Appendix  7  gives  the  total  number  of  hits  in  the  sample  for  each  searcher  and  each  query, 
with  the  number  of  these  which  were  relevant  by  each  of  the  two  criteria. 
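
The effect of the two relevance criteria on the ratios can be sketched as follows. The document identifiers and judgments are invented for illustration, and the function name is hypothetical. As in the study, the recall base is the set of relevant documents among those retrieved and judged, not every relevant document in the database.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    found = retrieved & relevant
    precision = len(found) / len(retrieved) if retrieved else 0.0
    recall = len(found) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: documents each judge marked relevant
# or partially relevant.
judge_1 = {"d1", "d2", "d3", "d4"}
judge_2 = {"d3", "d4", "d5", "d6", "d7", "d8"}

both = judge_1 & judge_2      # stricter criterion: both judges concur
either = judge_1 | judge_2    # more generous criterion: either judge

# Hypothetical retrieved set for one searcher.
retrieved = {"d2", "d3", "d4", "d9"}

strict = precision_recall(retrieved, both)
generous = precision_recall(retrieved, either)
```

Here the generous criterion raises precision from .50 to .75, because more of the retrieved documents count as relevant, but lowers recall from 1.00 to .375, because the base of relevant documents grows from two to eight.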

Figures  1  through  4  plot  the  weighted  mean  precision  vs.  the  weighted  mean  recall  for  each 
query,  by  both  criteria.  Perhaps  the  most  striking  thing  about  these  ratios  is  that  there  is  no 
evidence  of  the  conventional  inverse  relationship  between  precision  and  recall  for  these 
queries. 

Figure  1  shows  the  ratios  for  the  TR  Bibliographic  Database,  based  on  concurrence  of 
judges.  About  half  of  the  searches  (nine)  cluster  between  recall  levels  of  .27  and  .49  with 
precision  of  .28  to  .43.  Another  nine  searches  are  scattered  at  a  higher  precision  level  of 
.54 to .93, with a scattering of recall from .18 to .64. Finally, there is a single query with
recall of .60 and precision of .30.

Figure  2,  showing  precision  and  recall  based  on  either  judge,  does  not  show  much  clustering. 
Recall  ranges  from  .15  to  .57  and  precision  from  .30  to  .93. 

Figures  3  and  4  provide  similar  information  for  WUIS.  Figure  3  is  similar  to  Figure  1,  with 
nine  searches  more  or  less  clustered  at  recall  levels  .27  to  .63  and  precision  of  .15  to  .34. 
The  remaining  searches  are  much  more  scattered  in  precision,  from  .42  to  .91,  but  with 
similar  recall  of  .22  to  .65. 

In  Figure  4,  precision  and  recall  based  on  either  judge,  17  of  the  19  values  are  concentrated 
at  low  to  moderate  recall  levels  of  .13  to  .38  with  a  range  of  precision  of  .22  to  .91.  The  two 
outliers  on  recall,  at  .65  and  .59,  are  also  among  the  highest  precision  values  at  .80  and  .95, 
respectively. 


            Relevance based on         Relevance based on
Query       concurrence of judges      either judge
Number      Precision    Recall        Precision    Recall

  1            .64         .31            .38         .26
  2            .64         .22            .88         .22
  3            .28         .22            .80         .18
  4            .28         .62            .88         .57
  5            .73         .45            .73         .18
  6            .93         .64            .93         .42
  7            .93         .38            .48         .38
  8            .88         .22            .88         .22
  9            .37         .31            .30         .31
 10            .33         .27            .37         .19
 11            .3?         .64            .30         .10
 12            .31         .64            .41         .18
 13            .39         .45            .88         .31
 14            .37         .23            .48         .18
 15            .39         .38            .54         .25
 16            .?8         .64            .73         .31
 17            .73         .30            .66         .21
 18            .64         .64            .72         .51
 19            .41         .49            .55         .34

Pooled         .56         .38            .65          ?

Table 4.  Precision and Recall Ratios: TR Bibliographic Database


            Relevance based on         Relevance based on
Query       concurrence of judges      either judge
Number      Precision    Recall        Precision    Recall

  1            .66         .38            .68         .28
  2            .62         .25            .66         .18
  3            .32         .39            .37         .22
  4            .76         .59            .95         .59
  5            .81         .22            .82         .21
  6            .80         .65            .80         .65
  7            .91         .36            .91         .35
  8            .82         .2?            .83         .24
  9            .34         .36            .36         .36
 10            .17         .38            .22         .15
 11            .31         .63            .44         .38
 12            .23         .33            .33         .20
 13            .51         .2?            .51         .32
 14            .20         .2?            .29         .12
 15            .25         .56            .33         .16
 16            .15         .43            .32         .13
 17            .42         .47            .54         .18
 18            .67         .40            .73         .34
 19            .29         .38            .45         .27

Pooled         .53         .39            .60         .27

Table 5.  Precision and Recall Ratios: WUIS


Precision and Recall Ratios
TR Bibliographic Database: Concurrence of Judges
Figure 1

Precision and Recall Ratios
TR Bibliographic Database: Either Judge
Figure 2

Precision and Recall Ratios
WUIS: Concurrence of Judges
Figure 3

Precision and Recall Ratios
WUIS: Either Judge
Figure 4


It is interesting to examine an example of the factors affecting precision and recall by
different  searchers  on  the  same  query.  Query  6  in  the  TR  Bibliographic  Database  was 
selected  as  an  illustration.  Recall  and  precision  by  concurrence  of  judges  and  by  either  judge 
were  identical.  Of  95  unique  hits,  91  were  judged  relevant. 


Searcher A:                          4 hits
%collision avoid                     P=.25 (of 4 hits, 1 was relevant)
iacs=%collision avoid                R=.01 (of 91 relevant in sample, 1 retrieved)
and
%warning sys
iacs=%warning sys


Searcher F:                          86 hits
iacs=%collision avoid                P=.93 (of 86 hits, 82 were relevant)
iacs=%aircraft collision av          R=.90 (of 91 relevant in sample, 82 retrieved)
iacs=%airborne collision
collision avoidance
%collision avoidance sys
%aircraft collision av
%airborne collision av
%aircraft cas(collis


Searcher 1:                          29 hits
%collision avoidance                 P=.93 (of 29 hits, 27 were relevant)
%collision warning                   R=.30 (of 91 relevant in sample, 27 retrieved)
and
%aircraft
%airplane


Searcher 6:                          45 hits
collision avoidance                  P=.93 (of 45 hits, 42 were relevant)
and                                  R=.46 (of 91 relevant in sample, 42 retrieved)
?60%system

Searcher  F  achieved  both  high  recall  and  high  precision  on  this  search,  apparently  as  a  result 
of searching on both DTIC and IAC terms, and of using key terms beginning with the words
"aircraft..." and "airborne..." that are not in the thesaurus, but may have been assigned as
identifiers. Use of "AND" logic reduced recall by the other searchers. There are other
factors, however, such as use or nonuse of truncation (the "%" sign) that may also have
affected  recall. 
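
The effect of "AND" logic on recall can be illustrated with ordinary set operations. This is a sketch under stated assumptions, not a model of DROLS syntax: the posting lists, document numbers, and relevant set below are invented.

```python
# Hypothetical posting lists: documents indexed under each term.
postings = {
    "collision avoidance": {1, 2, 3, 4, 5, 6},
    "warning systems": {5, 6, 7},
    "aircraft collision avoidance": {8, 9},
}

# Hypothetical set of documents judged relevant to the query.
relevant = {1, 2, 3, 4, 5, 6, 8, 9}


def recall(retrieved, relevant_docs):
    """Share of the relevant documents that the search retrieved."""
    return len(retrieved & relevant_docs) / len(relevant_docs)


# Broad strategy: OR together alternative terms for the same concept.
or_result = postings["collision avoidance"] | postings["aircraft collision avoidance"]

# Narrow strategy: AND a second concept onto the first.
and_result = postings["collision avoidance"] & postings["warning systems"]

recall_or = recall(or_result, relevant)
recall_and = recall(and_result, relevant)
```

With these invented lists the OR strategy retrieves all eight relevant documents, while the AND strategy retrieves only the two documents indexed under both concepts.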


The  small  number  of  searcher  A’s  hits  that  were  included  in  the  sample  implies  that  one 
cannot  draw  firm  conclusions  about  the  low  precision  and  recall  of  this  search;  precision  of 
the  other  searchers  was  uniformly  high. 

In  Phase  Two  of  this  study,  individual  citations  will  be  examined  to  determine  why  relevant 
citations were not retrieved or nonrelevant citations were retrieved by each search of a
query. This retrieval failure analysis will provide specific information about the factors in
indexing  that  facilitate  or  hinder  retrieval.  Originally  it  was  intended  to  conduct  a 
preliminary  failure  analysis  as  part  of  Phase  One.  The  delays  and  difficulties  encountered 
in  data  gathering  made  this  impractical,  and  all  failure  analysis  was  rescheduled  for  Phase 
Two. 


Conclusions 

The  queries  which  were  searched  for  this  study  were  relatively  simple  ones.  They  were 
limited  to  subject  information,  and  usually  were  on  relatively  broad  topics.  Yet  the  variation 
in  search  strategies  was  great,  not  so  much  in  actual  choice  of  concepts  to  be  searched  as 
in  refinements  such  as  use  of  truncation,  inclusion  of  IAC  terms,  and  searching  of  words  in 
titles  or  abstracts. 

The  design  of  the  indexing  scheme  is  implicated  in  some  of  this  variation.  For  instance,  the 
need  for  explicit  inclusion  of  IAC-assigned  terms  in  a  search  can  lead  to  some  loss  of 
information,  because  some  IAC-originated  documents  have  no  terms  in  the  DTIC-assigned 
indexing  field  (23),  and  searchers  may  neglect  to  include  the  IAC  term  field  (44)  in  a  search. 
Similarly,  hierarchy  searching  can  only  be  as  good  as  the  hierarchies  themselves.  On  the 
other  hand,  truncation  capabilities  and  the  necessity  of  using  a  separate  step  (the  @qsrtab@ 
or  @srtab@  command)  to  search  abstracts  are  features  of  the  search  system. 

Since  this  study  was  intended  to  evaluate  indexing,  not  searching  or  searchers,  the  overall 
average  precision/recall  ratios  for  each  query  are  more  significant  than  the  ratios  for 
individual  searchers.  The  most  interesting  point  about  these  ratios  is  their  failure  to  show 
the  conventional  inverse  relationship  between  recall  and  precision. 

Another  striking  point  is  that,  even  though  the  method  used  to  determine  the  base  of 
relevant  documents  was  necessarily  quite  limited,  none  of  the  queries  had  a  very  high  mean 
recall.  The  relevant  documents  in  this  study  are  a  subset  of  those  that  were  retrieved  by  the 
searches  carried  out  for  the  study;  there  was  no  feasible  way  to  determine  how  large  a 
proportion  this  subset  is  of  all  the  documents  in  the  database  which  are  relevant  to  a  query. 
That  is,  we  do  not  know  how  many  relevant  documents  were  not  retrieved  by  any  of  the 
searchers.  Despite  this  limitation,  which  would  bias  the  results  toward  higher  apparent  recall 
than  was  actually  the  case,  the  mean  recall  was  not  extremely  high  for  any  of  the  queries. 

However,  inspection  of  the  search  strategies  shows  that  searchers,  even  though  they 
frequently  searched  on  the  same  terms,  used  a  number  of  different  capabilities  of  the  search 
system  —  but  did  not  use  them  all,  even  when  asked  to  search  comprehensively.  The  number 
of  different  permutations  of  hierarchy,  truncation,  and  different  subject  term  fields,  plus  the 
unavailability  of  narrative  text  fields  in  the  TR  Bibliographic  Database,  except  for  separate 
qualification  searching,  make  it  difficult  to  devise  the  optimum  strategy. 

Even  though  specifics  cannot  be  determined  at  this  time,  it  is  reasonable  to  conclude  that 
improvements to both indexing and the search engine are warranted. The difficulties
encountered in downloading indicate that there are problems somewhere in the
telecommunications  system  as  well.  One  improvement  in  searcher  training  can  also  be 
suggested:  that  descriptor  searches  regularly  take  account  of  field  44  (IAC-assigned  terms), 
to  assure  that  IAC-originated  documents  are  retrieved  whenever  appropriate. 

Phase  Two  will  provide  more  definitive  answers  as  to  the  causes  of  the  retrieval  inadequacies 
encountered,  at  least  for  the  documents  which  were  actually  retrieved  by  at  least  one 
searcher.  Those  which  remain  unknown,  of  course,  cannot  be  evaluated.  In  Phase  Two,  a 
sample  of  retrieved  documents  will  be  evaluated  for  each  query.  The  reasons  for  failure  of 
a  given  search  to  retrieve  a  relevant  document,  or  for  its  retrieval  of  a  nonrelevant 
document,  will  be  recorded  and  categorized.  It  is  anticipated  that  there  will  be  two  broad 
categories  —  indexing  failures  and  search  failures  —  with  subcategories  such  as  vocabulary 
inadequacy  (indexing)  or  failure  to  truncate  (search). 

While  these  analyses  will  contribute  to  the  long-run  goal  of  developing  the  new  system,  they 
should  also  suggest  some  concrete  improvements  to  indexing  and  searching  that  could  be 
included  in  training  programs  for  use  of  the  present  system. 

Also  in  Phase  Two,  a  philosophy  of  subject  indexing  will  be  developed.  This  effort  will 
involve  extensive  consultation  with  units  such  as  subject  analysis  within  DTIC,  as  well  as  with 
users  of  the  DTIC  databases. 


Appendix 1

Queries  in  the  Sample 


CSERIAC 

1.  Flight  control  and  instrumentation:  including  displays  and  related  topics. 

2.  +Gz  acceleration:  electrophysiological  effects  of  cardiac  arrhythmias  and  dysrhythmias 
on  subject  undergoing  +Gz  acceleration. 

3.  Technologies  to  mitigate  physiological  effects  of  fatigue:  Non-pharmacological  and 
non-invasive  methods  to  aid  in  sleep  promotion  or  induction.  Specifically  auditory  or 
deep  muscle  relaxation  strategies  for  sleep  or  reduction  in  environmental  distractions. 

4.  Helmet  mounted  displays,  including  Kaiser-Agile  Eye,  Polhemus-Magnatrak,  General 
Dynamics-Falcon  Eye,  Electro  optics  Industry-Knight’s  Eye,  and  Elbit’s-Dash 


GACIAC 

5.  Passive  ranging  from  passive  sensors  in  an  airborne  environment 

6.  Collision  avoidance  systems 

7.  Neural  networks  in  automatic  target  recognition 

8.  Navigation  and  guidance,  target  detection,  range  and  position  finding 

9.  Navigation  detection  countermeasures,  including  radar  countermeasures,  optical 
detection  &  detectors,  infrared  detection  &  detectors. 


IRIA 

10.  Infrared  surveillance,  wide  area  surveillance,  clutter,  ground  targets,  camouflage, 
weighted-difference  algorithms,  dual  band  IR,  color  ratios,  multi-band  IR 

11.  Infrared  detectors,  especially  for  drug  and  narcotics  detection 

12.  Visual  models:  visual  detection  and  acquisition  of  target  of  military  vehicles  (ground); 
camouflage  (visual)  of  military  vehicles;  color  contrast;  motion  detection  (visual 
detection). 


13.  Night  vision,  thermal  imagery,  electro  optical,  FLIR,  forward  looking  infrared,  sighting 
devices,  day/night  sight,  night  vision. 

14.  Beach  reconnaissance:  Minefield  detection  by  means  of  imaging,  including  automatic 
target  recognition. 


MTIAC 

15.  Fastener  coating  -  IVD:  coatings,  aluminum,  IVD  aluminum  coatings;  ion  vapor 
deposition;  coatings,  metal;  corrosion  protection;  fastener. 

16.  Grenade  Assembly:  M77  grenade  assembly, weapons  systems  assembly  process;  safety 
pin  removal  process  in  grenades  for  weapons  systems. 

17.  Neural  networks  in  manufacturing:  Neural  networks  in  composites  manufacturing; 
including  reports  dealing  with  neural  net  algorithms,  learning  techniques, 
implementation  issues,  integration  of  neural  nets  within  expert  systems  &  different 
application  areas,  especially  in  manufacturing. 

18.  Work  measurement,  process  improvement,  etc.  How  to  measure  work  or  processes, 
etc. 


SURVIAC

19.  Radar  warning  receivers  and  their  use  by  helicopters  and  fixed  wing  aircraft. 



Appendix  2 
Introductory  Materials 

Introduction  to  the  Study 

The  purpose  of  this  study  is  to  determine  the  baseline  quality  of  DTIC  indexing  by  studying 
retrieval  from  the  DTIC  databases.  The  findings  will  be  used  to  make  any  improvements 
to  the  indexing  system  which  may  be  warranted.  The  study  is  designed  as  a  measurement 
of  recall  and  precision,1  and  will  consist  of  two  phases:  a  data  acquisition  phase  and  a  test 
phase. 

In  the  data  acquisition  phase,  a  representative  group  of  questions  will  be  selected  from  the 
DTIC  query  log.  A  group  of  experienced  DTIC  searchers  will  search  the  questions 
exhaustively,  attempting  to  locate  as  many  relevant  citations  as  possible,  without  considering 
how  many  irrelevant  citations  may  be  retrieved.  The  10-year  default  will  be  accepted,  so 
that  records  from  the  past  10  years  will  be  retrieved.  This  will  limit  retrieval  to  the  relatively 
recent  past,  while  assuring  that  a  reasonably  broad  range  of  material  is  retrieved. 

The  purpose  of  this  phase  is  to  approach  as  closely  as  possible  the  retrieval  of  all  records 
in  the  databases  which  are  relevant  to  a  given  query  and  for  which  an  unclassified  document 
is  retrievable.  Since  it  is  not  practical  to  evaluate  every  record  in  the  database  for  relevance 
to  each  query,  this  approach  of  exhaustive  searching  will  make  it  possible  to  have  a  large 
base for comparison with retrieval in the test phase. The relevant records retrieved in the
data  acquisition  phase,  plus  any  additional  ones  found  in  the  test  phase,  will  be  treated  as 
if  they  were  all  of  the  relevant  records  to  be  found  in  the  database.  This  procedure  is 
typical  of  those  followed  in  studies  of  recall  and  precision  on  large  databases. 

In  the  test  phase,  experienced  search  intermediaries  from  the  DTIC  user  community  will 
perform  more  typical  searches  on  the  same  questions,  attempting  to  produce  results  that 
might  be  useful  for  the  requester  —  i.e.,  with  a  reasonable  balance  between  recall  and 
precision  in  retrieval. 

Neither  group  of  searchers  will  carry  out  any  post-processing  or  evaluation  of  individual 
citations  for  relevance;  the  entire  final  retrieved  set  will  be  submitted  for  relevance 
evaluation.  Subject-matter  experts  based  at  DTIC  IAC’s  will  evaluate  the  relevance  to  the 
query  of  each  record  retrieved,  permitting  precision  and  recall  ratios  to  be  determined  for 
the  test  searches. 

The  failures  of  recall  (relevant  records  not  retrieved)  and  precision  (irrelevant  records 
retrieved)  in  the  test  phase  will  be  analyzed  to  determine  the  reasons  for  the  failures.  Those 


1  Precision  is  the  percentage  of  documents  retrieved  which  is  actually  relevant  to  the 
query.  Recall  is  the  percentage  of  the  total  relevant  documents  in  a  collection  which  is 
actually  retrieved  by  a  query. 
failures which represent indexing problems will then become the subject of recommendations
for  improvement  of  the  indexing  system  in  the  next  phase  of  the  effort. 

While  multiple  searchers  will  be  searching  each  question,  the  purpose  of  the  study  is  not  to 
examine  searcher  performance.  For  most  of  the  analysis,  the  work  of  all  searchers  for  a 
given  question  in  each  phase  will  be  amalgamated;  even  when  retrieval  of  individual 
searchers  is  compared,  the  purpose  will  be  to  isolate  searcher  differences  as  a  variable  for 
statistical  analysis. 


Conducting Searches
Data  Acquisition  Phase 


The  purpose  of  this  phase  is  to  gather  data  comprising  as  many  as  possible  of  the  relevant 
documents  to  be  found  in  the  database,  even  at  the  price  of  retrieving  a  large  number  of 
irrelevant  citations.  These  relevant  documents  will  serve  as  the  basis  for  determining  the 
recall ratios of more typical searches, conducted during the test task. The indexing system
is  being  evaluated  in  this  study,  not  searcher  performance. 

Please  conduct  the  searches  in  the  order  designated.  These  are  all  queries  which  have  been 
submitted  to  DTIC  and  searched  by  DTIC  searchers  in  the  recent  past.  All  of  the  available 
information  about  the  query  has  been  provided,  but  not  the  search  strategy  which  was  used 
in  the  original  search.  Since  the  original  search  was  conducted  under  uncontrolled 
conditions,  it  is  not  relevant  to  the  study,  and  it  should  not  be  permitted  to  influence  the 
design  of  your  strategy. 

This  is  not  a  test  of  searching  speed,  and  you  may  not  finish  all  the  searches.  The  order  of 
the  searches  has  been  randomized,  so  that  each  searcher  searches  the  queries  in  a  different 
order,  but  each  query  will  be  searched  at  least  three  times  if  each  searcher  searches  10  or 
more  queries.  Ignore  the  number  beside  each  query;  it  is  there  for  coding  purposes  only. 
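
The report does not say how the randomized orders were generated. The sketch below shows one scheme that satisfies the stated property, under assumptions: the queries are shuffled once and each searcher starts at an evenly spaced point in that shuffled list. The searcher count of six is hypothetical; the 19 queries come from the study description.

```python
import random
from collections import Counter


def assign_orders(query_ids, n_searchers, seed=None):
    """Give each searcher a different rotation of one shuffled query list."""
    rng = random.Random(seed)
    base = list(query_ids)
    rng.shuffle(base)
    n = len(base)
    orders = []
    for i in range(n_searchers):
        start = (i * n) // n_searchers  # evenly spaced starting points
        orders.append(base[start:] + base[:start])
    return orders


def coverage(orders, n_done):
    """Count how often each query is searched if every searcher completes n_done."""
    counts = Counter()
    for order in orders:
        counts.update(order[:n_done])
    return counts


orders = assign_orders(range(1, 20), n_searchers=6, seed=0)
counts = coverage(orders, n_done=10)
print("fewest searches for any query:", min(counts.values()))
```

With 19 queries and six searchers who each complete 10 searches, this rotation gives every query at least three searches. A purely independent shuffle per searcher would not guarantee that.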

For  each  query,  devise  a  broad  search  strategy  on  the  general  topic,  designed  for  maximum 
recall.  You  may  refine  and  reformulate  the  strategy  as  much  as  you  wish,  until  you  are 
convinced  that  you  are  probably  retrieving  as  many  as  possible  of  the  documents  in  the 
database  which  are  relevant  to  the  query.  Search  both  the  Technical  Report  (TR)  and 
Work  Unit  (WUIS)  databases,1  modifying  the  strategy  as  appropriate  to  maximize  relevant 
retrieval  from  each.  Limit  retrieval  to  unclassified  limited  documents  —  i.e.,  documents  for 
which  an  unclassified  abstract  is  available,  and  use  the  10-year  default  to  retrieve  documents 
only  from  the  past  10  years. 

When  the  search  is  complete,  print  your  results  in  reverse  order  by  date  (i.e.,  latest  date 
first),  together  with  the  log  of  the  search.  Also,  download  the  results  to  floppy  disk.  No 
post-processing  of  the  results  should  be  carried  out;  we  are  interested  in  the  output  of  the 
search  system,  rather  than  in  human  ability  to  compensate  for  its  precision  failures.  List  on 
the  query  sheet  any  additional  information  you  think  may  be  helpful.  In  particular,  since  this 
is  a  test  of  subject  indexing  quality,  please  give  your  reasons  for  any  use  of  full  text  in  your 
search.  For  example,  did  you  use  full  text  because  the  concept  was  too  new  to  be  in  the 
thesaurus,  or  because  it  was  narrower  than  any  available  thesaurus  term? 
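
The print order requested above (latest date first) corresponds to a descending sort on the citation date. A minimal sketch, assuming each citation is held as a dictionary with a hypothetical "date" field:

```python
from datetime import date


def latest_first(citations):
    """Return citations in reverse order by date, i.e. latest date first."""
    return sorted(citations, key=lambda c: c["date"], reverse=True)


# Hypothetical citations, for illustration only.
citations = [
    {"id": "r1", "date": date(1989, 5, 1)},
    {"id": "r2", "date": date(1992, 11, 15)},
    {"id": "r3", "date": date(1990, 1, 30)},
]
for c in latest_first(citations):
    print(c["date"].isoformat(), c["id"])
```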


1  Depending  on  time  limitations,  in  the  afternoon  of  the  second  day  you  may  be  asked 
to  concentrate  on  the  TR  database,  rather  than  searching  both  TR  and  WUIS. 
Conducting  Searches
Test  Phase 


The  purpose  of  this  phase  is  to  retrieve  from  the  database  a  set  of  documents  that  would 
be  useful  to  the  person  posing  a  query.  The  indexing  system  is  being  evaluated  in  this  study, 
not  searcher  performance. 

You  will  be  assigned  one  or  more  queries  to  be  searched.  Please  conduct  the  searches  in 
the  order  designated.  These  are  all  queries  which  have  been  submitted  to  DTIC  and 
searched  by  DTIC  searchers  in  the  recent  past.  You  will  be  provided  all  of  the  available 
information  about  the  query,  but  will  not  be  provided  the  search  strategy  which  was  used  in 
the  original  search.  Since  the  original  search  was  conducted  under  uncontrolled  conditions,
it  is  not  relevant  to  the  study,  and  it  should  not  be  permitted  to  influence  the  design  of  your 
strategy. 

For  the  queries  assigned  to  you,  please  devise  a  search  strategy  designed  to  produce  what 
seems  to  you  like  a  reasonable  balance  between  precision  and  recall.  That  is,  you  should 
attempt  to  retrieve  as  many  documents  as  possible,  without  burdening  the  user  with  an 
unreasonable  number  of  irrelevant  documents.  You  may  refine  and  reformulate  the  strategy 
as  much  as  you  wish,  until  you  have  spent  what  seems  to  you  to  be  the  amount  of  time  you 
would  spend  on  a  typical  search  of  this  nature.  Search  both  the  Technical  Report  (TR)  and 
Work  Unit  (WUIS)  databases,  modifying  the  strategy  as  appropriate  to  maximize  relevant 
retrieval  from  each.  Limit  retrieval  to  documents  for  which  an  unclassified  record  is 
available,  and  use  the  10-year  default  to  retrieve  documents  only  from  the  past  10  years. 

Since  there  has  been  a  lapse  of  time  since  the  data  acquisition  searches  were  run,  you  will 
need  to  exclude  items  added  to  the  database  after  the  date  of  the  data  acquisition  search 
for  a  given  query.  The  sheet  for  each  query  indicates  the  cutoff  date  for  that  query. 
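
The cutoff rule above amounts to dropping any record that entered the database after the cutoff date for the query. A minimal sketch, assuming each record carries a hypothetical "date_added" field:

```python
from datetime import date


def exclude_after_cutoff(records, cutoff):
    """Keep only records entered into the database on or before the cutoff date."""
    return [r for r in records if r["date_added"] <= cutoff]


# Hypothetical records and cutoff, for illustration only.
records = [
    {"id": "r1", "date_added": date(1992, 3, 1)},
    {"id": "r2", "date_added": date(1992, 9, 20)},
]
kept = exclude_after_cutoff(records, cutoff=date(1992, 6, 30))
print([r["id"] for r in kept])
```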

When  the  search  is  complete,  print  your  results  in  reverse  order  by  date  (i.e.,  latest  date 
first),  together  with  the  log  of  the  search.  Also,  download  the  results  to  a  floppy.  No
post-processing  of  the  results  should  be  carried  out;  we  are  interested  in  the  output  of  the  search
system,  rather  than  in  human  ability  to  compensate  for  its  precision  failures.  List  on  the 
query  sheet  any  additional  information  you  think  may  be  helpful.  In  particular,  since  this 
is  a  test  of  subject  indexing  quality,  please  give  your  reasons  for  any  use  of  full  text  in  your 
search.  For  example,  did  you  use  full  text  because  the  concept  was  too  new  to  be  in  the 
thesaurus,  or  because  it  was  narrower  than  any  available  thesaurus  term? 


Relevance  Judging 


Determination  of  the  relevance  to  the  query  of  each  citation  retrieved  by  that  query  is 
required  to  permit  determination  of  recall  and  precision  ratios.  The  total  number  of 
citations  retrieved  for  a  query  is  sometimes  very  large.  For  judging  purposes,  an  upper  limit
of  100  citations  for  each  version  (TR  and  WUIS)  of  each  query  has  been  established.  When 
the  total  retrieval  was  above  this  number,  a  sample  of  100  citations  has  been  selected  for 
relevance  judging.  The  purpose  of  this  limitation  is  to  keep  the  effort  required  of  the  judges 
within  reasonable  bounds. 
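
The 100-citation limit can be sketched as follows. The text does not say how the sample was drawn; simple random sampling without replacement is assumed here, and the names are hypothetical.

```python
import random

JUDGING_LIMIT = 100  # upper limit per version (TR or WUIS) of each query


def select_for_judging(citations, limit=JUDGING_LIMIT, seed=None):
    """Return all citations if within the limit, otherwise a random sample."""
    citations = list(citations)
    if len(citations) <= limit:
        return citations
    # Assumption: simple random sampling without replacement.
    return random.Random(seed).sample(citations, limit)


small = select_for_judging(range(40))
large = select_for_judging(range(750), seed=0)
print(len(small), len(large))
```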

Please  review  each  citation  and  abstract  and  make  a  decision  about  its  relevance  to  the 
query,  on  a  four-point  scale:  Relevant  /  Partially  or  probably  relevant  /  Not  relevant  /  Not 
determinable. 

Relevant:  A  citation  which  bears  directly  on  the  query;  one  which  seems  very  likely  to 
contain  significant  information. 

Partially  or  probably  relevant:  A  citation  which  contains  useful  information  which  is  not 
central  to  the  query,  or  one  which  seems  likely  to  contain  useful  information. 

Not  relevant:  A  citation  which  seems  very  unlikely  to  contain  information  of  value  to  the
query. 

Not  determinable:  A  citation  for  which  you  cannot  determine  from  the  information  available 
whether  it  is  likely  to  be  relevant. 
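
The four-point scale can be represented and tallied as shown below. How "Partially or probably relevant" and "Not determinable" judgments enter the precision calculation is not stated in these instructions, so the treatment here (partials optionally counted as relevant, indeterminate items left out of the base) is an assumption for illustration.

```python
from collections import Counter
from enum import Enum


class Judgment(Enum):
    RELEVANT = "Relevant"
    PARTIAL = "Partially or probably relevant"
    NOT_RELEVANT = "Not relevant"
    NOT_DETERMINABLE = "Not determinable"


def precision_from_judgments(judgments, count_partial=True):
    """Percentage of judged citations that count as relevant.

    Assumption: "Not determinable" citations are excluded from the base.
    """
    counts = Counter(judgments)
    judged = sum(counts.values()) - counts[Judgment.NOT_DETERMINABLE]
    if judged == 0:
        return 0.0
    hits = counts[Judgment.RELEVANT]
    if count_partial:
        hits += counts[Judgment.PARTIAL]
    return 100.0 * hits / judged


sample = [
    Judgment.RELEVANT,
    Judgment.RELEVANT,
    Judgment.PARTIAL,
    Judgment.NOT_RELEVANT,
    Judgment.NOT_DETERMINABLE,
]
print(precision_from_judgments(sample), precision_from_judgments(sample, count_partial=False))
```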

Your  initials  should  be  placed  in  the  "Judge"  blank  at  the  top  of  the  form. 




Appendix  3 

Search  Strategies1 

01  TR 


Searcher  A 
@str@ 

iacs=%flight  control 
%  flight  control 
{flight  control 
and 

{flight  instruments 
%  flight  instrumen 
iacs= flight  instrumen 
end 


Searcher  C 

@str@ 

{^flight  control  systems 
%flight  display 
flight  instruments 
and 

instrumentation 
{flight  instruments 
end 


Searcher  E 

@str@ 

%flight  control 
{flight  control  systems 
and 

%  instrument 
end 


Searcher  3 

@str@ 

•flight  control  systems 

?60flig 

•?00flight

and 

•flight  control  sytems 

?60control 

•?00control 

and 

•instrumentation 

?60%instrument 

•?00%instrument 

end 


Searcher  5 

@str@ 

{%flight  control 
{%aircraft  control 
and 

{%instrument 

{%display 

end 


1  When  the  *@swuwps@*  command  was  used  to  search  WUIS  with  the  same  strategy  used  for 
TR,  the  command  is  shown,  but  the  strategy  is  repeated  for  the  convenience  of  the  reader  of  this 
report. 
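
The listings in this appendix appear to share one layout: a command between @ signs, one search term per line, the word "and" between groups of terms, and "end" closing the strategy. The sketch below splits a listing into those parts. Reading the terms within a group as alternatives is an assumption; the report does not describe DROLS syntax here, and the prefix characters (%, {, $, ?60 and so on) are kept as opaque text.

```python
def parse_strategy(text):
    """Split one printed strategy into (command, groups of terms)."""
    command = None
    groups = [[]]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if command is None and line.startswith("@") and line.endswith("@"):
            command = line.strip("@")  # e.g. "str" or "swu"
            continue
        if line.lower() == "end":
            break
        if line.lower() == "and":
            groups.append([])  # start the next group of terms
            continue
        groups[-1].append(line)
    return command, [g for g in groups if g]


listing = """
@str@
%flight control
{flight control systems
and
% instrument
end
"""
print(parse_strategy(listing))
```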
01  WU


Searcher  A 

@swu@ 

%flight  control 
de=$flight  control 
and 

de=$flight  instruments 
%flight  instrumen 
end 

Searcher  C 

@swu@ 

$flight  control  systems 
flight  instruments 
%flight  display 
and 

instrumentation 
$flight  instruments
end 


Searcher  5 

@swu@ 

sub=$%flight  control
sub=$%aircraft  control 
and 

sub = $%instrument 

sub=$%display 

end 


Searcher  E 

@swu@ 

%flight  control 
de= flight  control  systems 
sub = flightcontrol 
and 

%instrument 

end 


Searcher  3 

@swuups@ 

*  flight  control  systems 

?60flig 

*?00flight 

and 

•flight  control  sytems 

?60control 

•?00control 

and 

•instrumentation 

?60%instrument 

*?00%instrument 

end 
02  TR


Searcher  A 

@str@ 

arrhythmia 

%arrhythmia 

and 

electrophysiology
end 

@srtab@ 

acceleration 

gz 

end 


Searcher  C 

@str@ 

+gz 

%gz 

end 

@qsrtab@ 

acceleration 

end 


Searcher  5 

@str@ 

?60+gz 

$%+gz 

$%acceleration 

and 

$%cardiac 

$  %electrophysiology 
end 


Searcher  1 

@str@ 

$acceleration 

and 

%cardiac 

%heart 

and 

%effect

?60%effect 

?00%effect 

end 
Searcher  A

@swu@ 

arrhythmia 

arrhythmias 

kw=dysrhythmia  during  acceleration 
and 

de =electro  physiology 
sub =electrophysiology 
nar = electrophysiology 
end 


Searcher  C 

@swu@ 

%+gz 

%gz 

and 

%arrhythmia 
%cardiac  arrhythmia 
%dysrhythmia 
end 


Searcher  D 

@swu@ 

%arrhythmi 

%dysrhythmi 

and 

cardi 

heart 

hearts 

end 

@srtab@ 

acceleration 

accelerating 

accelerate 

accelerates 

accelerated 

accelerations 

end 


Searcher  1 

@swu@ 

de= cardiac 

de=heart 

kw=%cardiac 

de=%cardiac 

kw= heart 

%cardiac 

%heart 

and 

$acceleration 

and 

de=%effect 

kw=%effect

%effect 

end 

Searcher  5 

@swu@ 

sub=$%+gz 

sub = $%acceleration 

and 

sub=$%cardiac 

sub = $  %electro  physiology 

end 
Searcher  C

Searcher  2 

@str@ 

@str@ 

fatigue 

$sleep 

fatigue(physiology) 

$sleep  deprivation 

and 

and 

sleep 

{relaxation 

sleep  deprivation 

{muscles 

end 

{hearing 

Searcher  F  —  3  strategies  combined  in  user 

%bioacoust 
{sound  generators 
%noise  mask 

file 

?60%masking 

@str@ 

fatigue(physiology) 

and 

?00%masking 

Searcher  4 

%sleep  promot 
sleep  induction 

@str@ 

muscle  relaxation 

fatigue(physiology) 

deep  muscle  relaxation 

{sleep  deprivation 

muscle  relaxants 

and 

sleepability 

{relaxation 

relaxation(physiology) 

%environmental  disturbance 

end 

disturbance 

disturbances 
?60relaxation 
?60%disturbance 
?60%nonpharma 
?60%non-pharm 
?60%noninvasive 
?60%non-invasive 
{environments 
?60%environment 
end 

fatigue(physiology) 

?60%fatigue 

and 

%sleep  promo 
%sleep  induc
%muscle  relax 
relaxation(physiology) 
end 


@str@ 

?60relaxation 

?60%sleep 

and 

?60%promot 

?60%induce 

end 

@str@ 
Searcher  C

@swu@ 

fatigue 

fatigue(physiology) 

and 

sleep

sleep  deprivation 
sleep  disturbances 
sleep  loss 
sleep  patterns 
sleep  disorders 
%sleep  cycle 
end 

Searcher  F  —  3  strategies  combined  in  user 
file 

@swu@ 

fatigue(physiology) 

and 

%sleep  promot 
sleep  induction 
muscle  relaxation 
deep  muscle  relaxation 
muscle  relaxants 
sleepability 
relaxation(physiology) 
end 

@swu@ 
ti= relaxation 
ti=%sleep 
and 

ti=%promot 

ti=%induc 

end 

@swu@ 

fatigue(physiology) 

ti=%fatigue 

and 

%sleep  promo 
%sleep  induc
%muscle  relax 
relaxation(physiology) 
end 


Searcher  2 

@swu@ 

$fatigue(physiology) 
$fatigue  (physiology) 
and 

(physiological  effects 

%stress 

%physiolog 


Searcher  4 

@swu@ 

fatigue(physiology) 

$sleep  deprivation
and 

(relaxation 

%environmental  disturbance 

disturbance 

disturbances 

ti=relaxation 

ti = %disturbance 

ti+%nonpharma 

ti=%nonpharma 

ti=%non-pharm 

ti = %noninvasive 

ti = %non-invasive 

(environments 

ti = %environment 

end 
Searcher  E
@str@ 

?00%helmet  mounted  display 
end 


Searcher  F 

@str@

%helmet  mounted  displ 
%head  up  displ 
%heads  up  displ 
%HUD(head 
%HUDS(head 
iacs=helmet  mounted 
iacs= helmet  mounted  display 
iacs=%helmet  mounted  d 
iacs=%helmet-mounted  di 
iacs=%hud(head 
iacs=%huds(head 
end 


Searcher  4 

@str@ 

%  helmet  mounted  display 
%head  mounted 
%  head-mounted 
%hmd% 

%hmd(head 
%hmd  (helmet 
%  agile  eye 
%  falcon  eye 
end 


Searcher  3 

helmet  mounted  displays 

?00helmet 

?60helmet 

and 

helmet  mounted  displays 

?00mount 

?60mount 

and 

helmet  mounted  displays 

?00%display 

?60%display 

end 
Searcher  E

@swu@ 

de=%helmet  mounted  displays 
hmd 

sub=helmet  mounted  display 
kw=helmet  mounted  display 
end 

Searcher  F 

@swu@ 

%helmet  mounted  displ 

%head  up  displ 

%heads  up  displ 

%hud(head 

%huds(head 

iacs= helmet  mounted 

iacs=helmet  mounted  display 

iacs=%helmet  mounted  d 

iacs=%helmet  mounted  di 

iacs=%hud(head 

iacs=%huds(head 

end 


Searcher  3 

@swu@ 

helmet  mounted  displays 

?00helmet 

?60helmet 

and 

helmet  mounted  displays 

?00mount 

?60mount 

and 

helmet  mounted  displays 

?00%display 

?60%display 

end 


Searcher  4 

@swu@ 

%helmet  mounted  display 
%head  mounted 
%head-mounted 
%hmd% 

%hmd(head 
%hmd(helmet 
%  agile  eye 
%falcon  eye 
end 
05  TR


Searcher  A 

Searcher  E 

@str@ 

@str@ 

(passive  systems 

Passive  ranging 

and 

end 

(range  finding 
end 

@srtab@ 

air 

Searcher  3 

@str@ 

airborne 

?00%sensor 

aerospace 

?60%sensor 

end 

and 

Searcher  B 

airborne 

?60airborne 

end 

@str@ 

@srtab@ 

(detectors 

passive 

%passive  ranging 
%passive  sensor 
and 

end 

(aerospace  environments 
%airborne  environment 

Searcher  5 

end 

@str@ 

(%passive  ranging 

@srtab@ 

(%passive  sensor 

passive  sensors 

and 

ranging 

(%airborne

airborne 

(%aircraft 

passive  sensor 
end 

end 

Searcher  C 

@str@ 

passive  systems 

%passive 

and 

(detectors 

%sensors 

and 

range  finding 

( 

and 

airborne 

end 
Searcher  A

@swu@ 

%passive  system 
de=passive  systems 
and 

$range  finding 
%range  find 
end 

@srtab@ 

air 

airborne 

aerospace 

end 


Searcher  B 

@swu@ 

$detectors 
%passive  ranging 
%passive  sensor 
and 

$  aerospace  environments 
%airborne  environment
end 

@srtab@ 
passive  sensors 
ranging 
airborne 
passive  sensor 
end 


Searcher  C 

@swu@ 
passive  systems 
%passive 
and 

$detectors 

%sensors 

and 

$range  finding 
and 

airborne 

end 


Searcher  E 

%passive  ranging 
%passive  sensors 
end 


Searcher  3 

@swu@ 
kw=passive 
sub = passive 
%passive  ranging 
and 

%passive  ranging 
kw= ranging 
sub = ranging 
and 

kw=airborne
sub = airborne 
%  airborne 
end 


Searcher  5 

@swu@ 

sub=$%passive  ranging 
sub=$%passive  sensor 
and 

sub =$%  airborne 

sub=$%aircraft 

end 
Searcher  A

@str@ 

?60cas 

@str@ 

%collision  avoid

end 

iacs=%collision  avoid 

@qsrtab@ 

and 

collision 

%warning  sys

collisions

iacs=%warning  sys

collision  avoidance 

end 

collide 

collided 

aircraft  collision 

Searcher  B 

aircraft  collisions 
airborne  collision 

@str@ 

airborne  collisions 

collision  avoidance  systems 
collision  avoidance 
and 

end 

collision 

collision  avoidance 

Searcher  1 

end 

@str@ 

%collision  avoidance
%collision  warning 

Searcher  F  —  3  searches  combined  in  user 

and 

file 

%  aircraft 
%airplane 

@str@ 

?60avoidance 

and

end

?60%collision

and 

Searcher  6 

?60system 

@str@ 

?60systems 

collision  avoidance 

end 

and 

?60%system 

@str@ 

collision  avoidance 
%collision  avoidance  sys 
%aircraft  collision  av 
%  airborne  collision  av 
%aircraft  cas(collis 
end 

end 
Searcher  A

@swu@ 

%  collision  avoid 
and 

%warning  sys
%proximity  warn 
end 


ti= helicopters 
ti=gunship 
ti=gunships 
ti= tanker 
ti= tankers 
ti= refuel 
ti= refueling 
end 


Searcher  B 

@swu@ 

collision  avoidance  systems 
collision  avoidance 
avoidance 
and 

collision 

collision  avoidance 
end 


Searcher  F  —  3  searches  combined  in  user 
file 

@swu@ 

%  collision  avoid 
%aircraft  collision  av 
%  airborne  collision  ab 
%airborne  collision  av
%CAS(collision 
end 


@swu@ 

ti=cas 

sub=cas 

sub = collision  avoidance 
end 


Searcher  1 

@swu@ 

%collision  avoidance 
%collision  warning
%aircraft  guidance  system 
de=%collision  avoidance 
kw=%collision  avoidance 
de=%collision  warning 
kw=%collision  warning 
de=%aircraft  guidance  system 
kw=%  aircraft  guidance  system 
end 


Searcher  6 


@swu@ 
ti= avoidance 
ti= avoiding 
ti= avoid 
ti= avoided 
and 

ti= collision 
ti= collide 
ti= collides 
ti= collided 
ti= colliding 
and 

ti= aircraft 
ti= airborne 
ti= helicopter 


@swu@ 

collision  avoidance 
end 




07  TR 


Searcher  A 

@str@ 

%neural  net 
iacs=%neural  net 
and 

%target  recog 
%target  det 

%automatic  target  recog 
%automatic  target  det 
iacs=%target  recog 
iacs=%target  det 
iacs=%automatic  target  recog 
iacs=%automatic  target  det 
end 


Searcher  E 

%neural  network 
and 

$target  recognition
end 


Searcher  5 

@str@ 

$%automatic  target 
and 

$%neural  network 
$%artificial  intelligence 
end 


Searcher  6 

@str@ 
neural  nets 
and 

target  recognition 
end 
07  WU


Searcher  A 

@swu@ 

%neural  net 
and 

%target  recog 
%  target  de 

%  automatic  target  recog 
%automatic  target  de 
end 


Searcher  B 

@swu@ 

$networks 
neural  nets 
and 

$networks
neural  nets 
and 

target  recognition 
automatic  target  recognition 
end 


@swu@ 

%neural  net 
and 

%atr(auto 

%automatic  target  recog 
end 


Searcher  E 


%neural  network 
and 

$target  recognition 
end 


Searcher  5 

@swu@ 

sub=$%automatic  target 
and 

sub=$%neural  network 
sub=$%artificial  intelligence 
end 


Searcher  D  —  2  searches  combined  in  user 
file 

@swu@ 
neural 
and 
net 
nets 

%network 
end 

@srtab@ 
neural  net 
neural  nets 
neural  networking 
and 
atr 

automatic  target 
end 


Searcher  6 

@swuups@ 
neural  nets 
and 

target  recognition 
end 
Searcher  B


Searcher  D  —  2  strategies  combined  in  file 


@str@ 

•navigation 
•target  detection 
•position  finding 
•range  finding 
and 

•guidance 
•target  detection 
•position  finding 
•range  finding 
end 

@srtab@ 

navigation  and  guidance 
target  detection 
range  detection 
position  finding 
range  finding 
end 

Searcher  C 

@str@ 

$•  targets 
target  detection 
and 

$  •detection 
target  detection 
and 

$range  finding 
$  position  finding 
and 

(navigation 

(guidance 

end 


@str@ 

(range(distance) 

and 

(position(location) 

AND 

(guidance 

and 

(navigation 

and 

(targets 
%target  detect 
and 

%  target  detect 

(detection 

end 

@str@ 

iacs=%guid 

and 

iacs=%navigat 

and 

iacs=%  target 
and 

iacs=%detect 

end 

@srtab@ 

position 

positioning 

and 

guidance 

guided 

guiding 

end 


Searcher  4 

@str@ 

(navigation 

(guidance 

and 

target  detection 
and 

(range  finding 
(position  finding 
end 
Searcher  5
@str@ 

$%target  detection 
and 

$%navigation 

$%guidance 

$%range 

$%position  finding 
end 
Searcher  B

@swu@ 
navigation 
target  detection 
position  finding 
range  finding 
and 

guidance 

end 

@quftab@ 

navigation  and  guidance 
target  detection 
range  finding 
position  finding 
end 


Searcher  C  —  2  strategies  combined  in  user 
file 

{*  targets 
target  detection 
and 

{•detection 
target  detection 
and 

$range  finding 
{position  finding 
and 

{navigation 

{guidance 

end 

@swu@ 

{targets 

target  detection 
and 

{detection 
target  detection 
and 

{range  finding 
{position  finding 
and 

{navigation 

{guidance 

end 


Searcher  D  —  2  strategies  combined  in  user 
file 

@swu@ 

{range(distance) 

and 

{position(location) 

and 

{navigation 

and 

{guidance 

and 

{targets 
%target  detect 
and 

%target  detect 

{detection 

end 

@swu@ 

range 

ranging 

and 

positioning 

end 

@srtab@ 

target 

targets 

and 

navigation 

navigating 

and 

guidance 

guiding 

end 


Searcher  4 

@swuups@ 

{navigation 

{guidance 

and 

target  detection 
and 

{range  finding 
{position  finding 
end 




Searcher  5 


@swu@ 

sub=$%target  detection 
and 

sub=$%navigation
sub=$%guidance 
sub=$%range 
sub=$%position  finding 
end 




Searcher  A 

@str@ 

radar  countermeasures 
iacs= radar  countermeasures 
{optical  countermeasures 
iacs= optical  countermeasures 
and 

%optical  detect 
{optical  detection 
iacs= optical  detection 
iacs= optical  detectors 
end 


Searcher  C 

@str@ 

{navigation 

and 

{detection 
{detectors 
{optical  detectors 
and 

{countermeasures 
{electronic  countermeasures 
end 


Searcher  E 
@str@ 

{navigation 

and 

{radar  countermeasures 
{optical  detection 
{optical  detectors 
end 


09  TR 


Searcher  2—2  strategies  combined  in  user 
file 

@str@ 

{navigation 
{navigation  aids 
{iff  systems 
and 

{detection 

{detectors 

and  {countermeasures 
end 

@str@ 

{navigation 

{navigation  aids 

{iff  systems 

and 

{radar 

{sonar 

and 

{electronic  warfare 
{infrared  countermeasures 
end 


Searcher  4  —  2  strategies  combined  in  user 
file 

@str@ 

{navigation 

and 

{detection 

and 

{countermeasures 

end 

@str@ 

radar  countermeasures 
{optical  detection 
{optical  detectors 
%detection  countermeasure 
and 

{navigai 

{navigation 

?60navigation 

end 
Searcher  A

@swu@ 

nar=navigation 

and 

nar=%detect 

and 

nar = %  countermeasure 
end 


Searcher  C 


Searcher  2  —  2  strategies  combined  in  user 
file 

@str@ 

{navigation 
{navigation  aids 
{iff  systems 

and

{detection 

{detectors 

and  {countermeasures 
end 


@swu@ 

{navigation 

and 

{detection 
{detectors 
{optical  detectors 
and 

{countermeasures 
{electronic  countermeasures 
end 


@str@ 

{navigation 

{navigation  aids 

{iff  systems 

and 

{radar 

{sonar 

and 

{electronic  warfare 
{infrared  countermeasures 
end 


Searcher  E 

@swu@ 

{navigation 

and 

{radar  countermeasures 
{optical  detection 
{optical  detectors 
end 

@qsrtab@ 

navigation 

radar  countermeasures 
end 


Searcher  4  —  2  strategies  combined  in  user 
file 

@swu@ 

{navigation 

and 

{detection 

and 

{countermeasures 

end 

@swu@ 

radar  countermeasures 
{optical  detection 
{optical  detectors 
%detection  countermeasure 
and 

{navigation 

ti=navigation 

end 
Searcher  D  —  2  strategies  combined  in  user
file 

@str@ 

%weighted  difference  alg 

%  color  ratio 

end 

@str@ 

%surface  target 
$vehicles
%vehicle 
and 

$deception 

%camouflage 

$decoys

%decoy 

and 

%ground  clutter 

%clutter 

end 

@srtab@ 

ir 

infrared 

infra-red 

end 


Searcher  E 

@str@ 

?00infrared  surveillance 

%wide  area  surveillance 

?00%camouflage 

%flir 

and 

ground  targets 

clutter 

end 


Searcher  F 

@str@ 

%ir  surveil
%infrared  survei 
%wide  area  surv 
%wass(wide 


?60wass 
?00clutter 
%ir  clutter 
%ir  radar  clutter 
%  infrared  clutter 
%infrared  radar  clutter 
%multiband  ir 
%multiband  infrared 
%ir  detection 
%ir  detector 
%  infrared  detect 
%dual  band  ir 
%dualband  ir 
%dual  band  infrared 
%dualband  infrared 
dual  band 
%dual  band  flir 
%dual  band  radar 
daul  band  seeker 
%dual  band  seeker 
%dual  band  transmitter 
and 
targets 

ground  targets 

surface  tra 

surface  targets 

radar  targets 

visual  targets 

military  targets 

camouflasge 

camouflage 

radar  camouflage 

?60target 

?60targets 

?60camouflage 

?60camouflaged 

?60camouflaging 

and 

color  ratio 
color  ratios 
algorithms 
?60algorithm 
?60algorithms 

%weighted  difference  algorithm 

%  color  radar 

%color  raster

%color  monitor 

%color  image 

%color  dis 
Searcher  D

@swu@ 

color 

and 

ratio 

ratios 

end 


%  color  com 
%  color  cont 
%color  crt 
%  color  cons 
%  color  discrim 
end 


Searcher  5  —  this  search  had  over  13000  hits 
no  downloading  was  attempted 


@srtab@ 

color  ratio 

colors 

color-ratio 

color-ratios 

and 

target 

targets 

end 


Searcher  E 

@swu@ 

de= infrared  surveillance 
%wide  area  surveillance 
de=camouflage
%flir 
and 

ground  targets 

clutter 

end 

Searcher  F  —  2  strategies  co 

@swu@ 

%ir  surveil 
%infrared  survei 
%wide  area  surv 
%wass(wide 
ti=wass 
de=clutter 
%ir  clutter 
%ir  radar  clutter 
%infrared  clutter 
%  infrared  radar  clutter 
%multiband  ir 
%multiband  infrared 
%ir  detection 
%ir  detector 


@str@ 

$%infrared  surveillance 

infrared  surface  search  and  surveillance 

systems 

$%wide  area  surveillance 

$%clutter 

$%ground  target 

$%camouflage 

algorithms 

$%color  ratio 

$%multiband  infrared 

end 


Searcher  6 

@str@ 

infrared  surveillance 
%wide  area  surveillance 
and 

?60ground 
surface  targets 
%ground  target 
and 

?60%target 
surface  targets 
%  ground  target 
end 


10  WU


Searcher  D 

@swu@ 

color 

and 

ratio 

ratios 

end 

@srtab@ 

color  ratio 

colors 

color-ratio 

color-ratios 

and 

target 

targets 

end 


Searcher  E 

@swu@ 

de= infrared  surveillance 
%wide  area  surveillance 
de= camouflage 
%flir
and 

ground  targets 

clutter 

end 

Searcher  F  —  2  strategies  combined  in  file 

@swu@ 

%ir  surveil 
%  infrared  survei 
%wide  area  surv 
%wass(wide 
ti=wass 
de= clutter 
%ir  clutter 
%ir  radar  clutter 
%  infrared  clutter 
%  infrared  radar  clutter 
%multiband  ir 
%multiband  infrared 
%ir  detection 
%ir  detector 


%  infrared  detect 
%dual  band  ir 
%dual  band  ir 
%dualband  ir 
%dual  band  infrared 
%dualband  infrared 
dual  band 
%dual  band  flir 
%dual  band  radar 
%dual  band  seeker 
%dual  band  transmitter 
and 
targets 

ground  targets 
surface  targets 
radar  targets 
visual  targets 
military  targets 
caamouflage 
camouflage 
radar  camouflage 
ti= target 
ti= targets 
ti=camouflage 
ti= camouflaged 
ti=camouflaging 
and 

color  ratio 
color  ratios 
algorithms 
ti= algorithms 

%weighted  differance  algorithm

%color  radar 

%color  raster 

%color  monitor 

%color  image 

%  color  dis 

%  color  com 

%color  crt 

%  color  cons 

%colordiscrim 

%color  discrim 

end 

@swu@ 

ti=wassn 

not 

personnel  management 
end 




Searcher  5  —  this  searcher  had  over  10,000 
hits;  no  downloading  was  attempted 

@swu@ 

sub=$%infrared  surveillance 

sub = infrared  surface  search  and  surveillance 

systems 

sub=$%wide  area  surveillance 
sub=$%clutter 
sub=$%ground  target 
sub=$%camouflage
sub = algorithms 
sub=$%color  ratio 
sub=$%multiband  infrared 
end 


Searcher  6 

@swu@ 

infrared  surveillance 
%wide  area  surveillance 
end 

@srtab@ 

target 

targets 

end 
Searcher  A
@str@ 

(optical  detectors 
and 

%infrared 

and 

%drug 

%narcotic 

(drugs 

end 


Searcher  B 

@str@ 

infrared  detection 
infrared  detectors 
infrared  images 
and 
drugs 

drug  interdiction 
drug  smuggling 
infrared  detectors 
end 

@srtab@ 

drug 

drugs 

interdiction 

end 


Searcher  F 
@str@ 

infrared  detection 
%infrared  detect 
%ir  detect 
and 
drugs 

%drug  detect 

?60drug 

?60drugs 

narcotics 

?60narcotic 

?60narcotics 

end 


Searcher  1 

@str@ 
%ir(infrared 
%  infrared 
and 

%detect 

(detection 

and 

(drugs 

?00drugs 

?00narcotics 

?61drugs 

?61narcotics 

end 


Searcher  5 

@str@ 

(%infrared  detect 
and 

(%drug 

(  %narcoterrorism 

(%narcotic 

end 
Searcher  A

@swu@ 

de=$optical  detectors 
and 

sub =%  drug 

nar=%drug 

sub=%narcotic 

nar=%narcotic

de= drugs 

and 

sub=%infrared 
nar=%  infrared 
end 


Searcher  B 

@swu@ 

infrared  detection 
infrared  detectors 
infrared  images 
and 
drugs 

drug  interdiction 
drug  smuggling 
end 

Searcher  F 

@swu@ 

infrared  detection 
%infrared  detect 
%ir  detect 
and 
drugs 

%drug  detect 
ti=drug 
ti= drugs 
narcotics 
ti= narcotic 
ti= narcotics 
end 


Searcher  1 

@swu@ 

de=%ir(infrared
de=%  infrared 
and 

de=$detectors 

and 

de=%drug 

de=$drugs 

end 

Searcher  5 

@swu@ 

sub=$%infrared  detect 
and 

sub=$%drug 

sub = $  %narco  terrorism 

sub=$%narcotic 

end 
Searcher  A
@str@ 

target  acquisition 
%  target  acquisition 
and 

computerized  simulation 
mathematical  models 
iacs=computerized  simulation 
%  visual  models 
%  visual  detect 
and 

$  ground  vehicles 
end 


Searcher  B 

@str@ 

visual  perception 
cammouflage 
visual  detection 
and 

detection 
visual  perception 
and 

{military  vehicles 
%military  vehicles
end 


Searcher  F 
@str@ 

%military  vehicle
%  combat  vehicle 
and 

color  contrast 
%motion  detect 
%visual  acq 
camouflage 
visual  camouflage 
?60%detect 
?60camouflage 
visual  model 
end 


Searcher  4 

@str@ 

%visual  acquisition 
%visual  detection 
vision 

visual  perception 

visual  surveillance 

visual  targets 

camouflage 

coloring 

colors 

motion 

ground  speed 

and 

%visual  acquisition 
%  visual  detection 
detection 
target  detection 
{optical  detection 
target  acquisition 
target  discrimination 
%motion  detection 
and 

{military  vehicles 
end 


Searcher  5 

@str@ 

{%visual  acquisition
{%visual  detection 
{% visual  target  acquisition 
{%visual  target  detection 
{%camouflage  (visual
{% visual  camouflage 
{%color  contrast 
{%motion  detection 
and 

{%military  ground
{%military  vehicle 
{%ground  vehicle 
{%vehicle 
{%combat  vehicle 
end 
Searcher  A
@swu@ 

target  acquisition 
%target  acq 
and 

computerized  simulation 
iacs=computerized  simulation 
algorithms 
%  visual  model 
%visual  detect 
$mathematical  models 
and 

{ground  vehicles 
end 


Searcher  B 
@swu@ 

visual  perception 

camouflage 

and 

detection 

and 

{military  vehicles 
%  military  vehicle 
end 


Searcher  F 

@swu@ 

%military  vehicle 
%combat  vehicle 
and 

color  contrast 
motion  detection 
%motion  detect 
%visual  acq 
camouflage 
visual  camouflage 
ti=%detect 
ti=camouflage
visual  model 
end 


Searcher  4 

@swuwps@ 

%visual  acquisition 
%visual  detection 
vision 

visual  perception 

visual  surveillance 

visual  targets 

camouflage 

coloring 

colors 

motion 

ground  speed 

and 

%  visual  acquisition 
%visual  detection 
detection 
target  detection 
{optical  detection 
target  acquisition 
target  discrimination 
%motion  detection 
and 

{military  vehicles 
end 


Searcher  5 
@swu@ 

sub={%visual  acquisition 
sub={%visual  detection 
sub={%visual  target  acquisition 
sub={%visual  target  detection 
sub={%camouflage  (visual 
sub={%visual  camouflage 
sub={%color  contrast 
sub={%motion  detection 
and 

sub={%military  ground 
sub={%military  vehicle 
sub={%ground  vehicle 
sub={%vehicle 
sub={%combat  vehicle 
end 
Searcher A

@str@ 
night  vision 
night  vision  devices 
and 

thermal  images 
%thermal imag
iacs=%thermal  imag 
%flir 

forward  looking 
%  forward  looking 
end 


Searcher  B 

@str@ 
night vision
night sights
night vision devices
and 

thermal  images 

forward  looking  infrared  systems 
flir

thermal  imagery 
electrooptical  photography 
end 


Searcher  C 

@str@ 

%night vision

and 

%flir 

forward  looking  infrared  radar 

%forward looking infrared (flir

infrared  images 

%thermal  imager 

thermography 

electrooptics 

electrooptical  photography 
end 


Searcher  3 

@str@ 

night vision

night vision devices

%day/night

%  night  sight 

and 

thermal  images 

thermal  imagery 

electron  optics 

?60electrooptical 

?00electrooptical

forward looking infrared systems

?00flir

?60flir

end 


Searcher  4 

@str@ 

%night  sight% 

%  night  seeing 
night  vision 
and 

%  thermal  imag 
%electrooptic 
%electrooptical 
%electro -optic 
%flir% 

%flir(forward 

%forward  looking  i 

and 

$sights

%sighting

%night  vision  devices 
%flir% 

%flir(forward 
%forward  looking  i 
end 
Searcher A

@swu@ 

de=night  vision 

de= night  vision  devices 

and 

%flir 

de= forward  looking 
%forward  looking 
end 


Searcher  B 

@swu@ 
night  vision 
night  sights 
night  vision  devices 
thermal  images 

forward  looking  infrared  systems 
flir

thermal  imagery 
electrooptical  photography 
end 

@srtab@ 

thermal  imagery 

forward  looking  infrared 

electrooptical 

flir

sighting  devices 
and 

night  vision 
end 


Searcher  C 
%flir 

forward  looking  infrared  radar 
%forward looking infrared (flir
infrared  images 
%thermal  imager 
thermography 
electrooptical  photography 
and 

night  vision 
end 


Searcher  3 

@swuwps@ 
night  vision 
night  vision  devices 
%day/night 
%night  sight 
and 

thermal  images 

thermal  imagery 

electron  optics 

ti=electrooptical 

sub=electrooptical

forward  looking  infrared  systems 

sub=flir 

ti=flir 

end 

Searcher  4 

@swuwps@ 

%night  sight% 

%night  seeing 
night  vision 
and 

%thermal  imag 

%electrooptic 

%electrooptical 

%electro-optic 

%flir% 

%flir(forward 

%forward  looking  i 

and 

{sights 

%sighting 

%  night  vision  devices 
%flir% 

%flir  (forward 
%  forward  looking  i 
end 
Searcher A

@str@ 

{coastal  regions 

@str@ 

and 

{detection 

{  mines(ordnance) 

iacs=%detect 

{mine  warfare 

and 

and 

$mines(ordnance) 

{detection 

and 

%imag 

end 

end 

@str@ 

%  coastal 
coast 

Searcher  B 

coasts 

%coastline 

@str@ 

%littoral 

beach  reconnaissance 

?00%surf 

minefield  detection 

and 

beaches 

{mines(ordnance) 

minefields 

{mine  warfare 

and 

and 

target  detection 

{detection 

imaging  detection 
imaging  devices 

end 

infrared  images 

@str@ 

image  processing 

{shores 

target  recognition 

and 

beach  reconnaissance 

{mines(ordnance) 

end 

Searcher  D  —  4  strategies  combined  in  user 

%  minefield 
{mine  warfare 
end 

file 

@srtab@ 

detect 

@str@ 

detects 

?60%beach 

detected 

?60shore 

detection 

?60shores 

detecting 

?60littoral

?60surf 

and 

?60reconnaissance 

end 

end 
Searcher E


@str@ 

%minefield  detection 
%minefield  breach 
end 

@qsrtab@ 

beach 

reconnaissance 

imaging 

end 


Searcher  3 

@str@

*mine detection
?60minefield
?00minefield
and 

*mine  detection 
?60%detect 
?00%detect 
end 

@srtab@ 

beach 

shore 

beaches 

shores 

end 


Searcher  6 

@str@ 

mine  detection 
mine  detectors 

minefields 

and 

$images 

target  recognition 

target  detection 

and 

shores 

beaches 

beach  heads 

end 
Searcher A

@swu@ 

de=detection 

%detect 

and 

de=$mines(ordnance)

and 

%imag 

end 


Searcher  B 

@swu@ 

beach  reconnaissance 
minefield  detection 
beaches 
and 

target  detection 
imaging  detection 
imaging  devices 
infrared  images 
image  processing 
target  recognition 
beach  reconnaissance 
end 


@swu@ 

beach 

and 

reconnaissance 

end 

@srtab@ 

beach  reconnaissance 
end 

@swu@ 

%beach reconnaiss
end 

@swu@ 

littoral 

surf 

and 

mine 


mines 

minefield 

minefields 

end 

@srtab@ 

detect 

detection 

detects 

detect 

detecting 

end 

@swu@ 

mine 

mines 

ordnance 

and 

shore 

shores 

beach 

beaches 

end 

@srtab@ 

detect 

detects 

detecting 

detection 

detected 

detector 

detectors 

end 

@swu@ 

{shores 

beach 

beaches 

%littoral 

surf 

surfing 

and 

{mines(ordnance) 

and 

{detection 

{tracking 

%automatic  target  track 

%atr(automatic 

end 


Searcher  D  —  5  strategies  combined  in  file 
Searcher E

Searcher  3 

@swu@ 

@swuwps@ 

%minefield  detection 

*mine  detection 

%  minefield  breach 

?60minefield 

end 

?00minefield 

and 

@qsrtab@ 

*mine  detection 

beach 

?60%detect 

reconnaissance 

?00%detect 

imaging 

end 

end 

@srtab@ 

beach 

Searcher  F 

beaches 

shore 

@swu@ 

shores 

?00beach 

shoreline 

ti=beach 

shorelines 

ti=beaches 

ti=beachhead 

ti=beachheads 

end 

ti=beachheads 
%beach  recon 

Searcher  6 

surf 

@swuwps@

surf  zone 

mine  detection 

%surf  zones 

mine  detectors 

%littoral 

minefields 

%  littoral  zon 

and 

ti=littoral 

$images

and 

target  recognition 

mines  (ordnance) 

target  detection 

mines 

and 

%mine  detect 

shores 

%minefield 

beaches 

%minefields 

beach  heads 

%mine  imag 

%navakxxx 

%naval  mine 

%underwater  mine 

automatic  target  recognition 

%atr(automatic  target  r 

end 

end 
Searcher C

@str@ 

$  coatings 
and 

aluminum 

%ivd(ion 

?60ivd 

%ion  vapor  deposi 

?60vapor 

and 

fastenings 

end 


Searcher D - 3 strategies combined in user file

@str@ 

?60ivd 

end 

@srtab@ 

aluminum 

al 

and 

coating 

coatings 

coated 

coats 

end 

@str@ 

?60%fasten 

and 

?60%coat 

end 

@srtab@ 

al 

aluminum 

and 

ivd 

ion  vapor 
end 


@str@ 

$  fastenings 
and 

$coatings 

end 

@srtab@ 

al 

aluminum 

and 

ivd 

ion  vapor 
end 


Searcher E

@str@

fasteners-c

fasteners-m

fasteners-nf

fasteners-p

fasteners-t

fasteners/corrosion

?00fastenings

and 

coating 

coatings 

end 


Searcher  4 

@str@ 

{fastening 

%fastener 

?60%fastener 

and 

{coatings 

%ivd(ion 

%ion  vapor  deposition 
{corrosion  inhibition 
%  corrosion  protection 
end 
Searcher 5


@str@ 

$%metal  coating 

$%aluminum  coating 

$%fastener 

$%coaating 

$%coating 

and 

$%ivd(ion 

$%ion  vapor  deposit 
$%corrosion  protection 
end 


15  WU 


Searcher  C 

Searcher  E 

@swuwps@ 

@swu@ 

{coatings 

%  fasteners 

and 

de= fasteners 

aluminum 

sub = fasteners 

%ivd(ion 

fasteners,  seals,  clamps 

?60ivd 

nar=fasteners 

%ion  vapor  deposi 

and 

?60vapor 

coatings 

and 

end 

fastenings 

end 

Searcher  4 

Searcher  D  —  2  strategies  combined  in  user 

@swu@ 

file 

{fastening 

%fastener 

@swu@ 

ti=%fastener 

ivd 

and 

and 

{coatings 

%coating 

%  ivd  (ion 

and 

%ion  vapor  deposition 

aluminum 

{corrosion  inhibition 

end 

%corrosion  protection 

end 

@swu@ 

vapor 

and 

Searcher  5 

ion 

and 

@swu@ 

%deposit 

sub={%metal  coating 

and 

sub={%aluminum  coating 

aluminum 

sub={%fastener 

al 

sub={%coating 

and 

and 

%coat 

sub={%ivd(ion 

end 

sub={%ion  vapor  deposit 

sub={%corrosion  protectu 

@uftab@ 

end 

fastener 

fasteners 

fastenings 

end 
Searcher A

Searcher  1  —  2  strategies  combined  in  user 
file 

@str@ 

$grenades

@str@ 

iacs=%grenade 

%grenade 

and 

and 

assembly 

%assemb 

%assembl 

end 

%fabricat 

%manufactur 

@str@ 

disassembly 

%grenade 

end 

and 

%safety  pin 
%safety 

Searcher  C 

and 

%pin 

@str@ 

%device 

grenades 

%fastening 

and 

%fastener 

assembly 

end 

end 

Searcher  5 

Searcher  D 

@str@ 

@str@ 

$%grenade  systems 

?60%grenade 

$%grenades 

end 

$%grenades  xm-77 
and 

@srtab@ 

$%assembly 

m-77 

$%safety  pins 

mil 

end 

m/77 

end 

Searcher  E 

@str@ 

m77  shaped  charges 
end 



16  WU 


Searcher  A 

@swu@ 

%grenade 

@swu@ 

kw=%grenade 

%grenade 

de=%grenade 

de=$grenade 

and 

and 

%safety  pin 

de= assembly 

kw=%safety  ping 

de=disassembly

kw=%safety  pin 

%assembl 

de=%safety  pin 

%fabricat 

%safety 

%manufact 

and 

de=$fabrication 

%pin

end

%device

%fastening

%fastener 

Searcher  C 

end 

@swu@ 

grenades 

Searcher  5  —  No  hits;  searcher  did  not  try 

and 

again 

assembly 

end 

@swu@ 

sub=$%grenade  systems 
sub=$%grenades 

Searcher  E 

sub=$%grenades  xm-77 
and 

@swu@ 

sub=$%assembly 

mil 

sub=$%safety  pins 

m-77 

end 

m/77 

end 

Searcher  1  —  2  strategies  combined  in  user 
file 

@swu@ 

%grenade 

de=%grenade 

kw=%grenade 

and 

%assembl 

de=%assembl 

kw=%assembl 

end 
Searcher A

@str@ 

%neural  net 
iacs=%neural  net 
and 

manufacturing 

iacs=%manufact 

end 


Searcher  E 
@str@ 

%neural  network 
end 


Searcher  F  —  2  strategies  combined  in  user 
file 

@str@ 

%neural  net 
%expert  suys 
%expert  sys 
and 

manufacturing 
industrial  plants 
fabrication 
{fabrication 
{molding  techniques 
material  forming 
materials  forming 
composite  fabrication 
composites  fabrication 
iacs=manufacturing
iacs=fabrication
iacs=molding techniques
iacs=material forming
iacs=materials forming
iacs=composites
iacs=composite materials
{composite  materials 
end 


@str@ 

?60neural 

and 

?60net 

?60nets 

?60network 

?60networks 

?60networking 

?60networked 

and 

industrial  plants 

fabrication 

?60fabricate 

?60fabricated 

?60fabricating 

?60fabrication 

manufacturing 

?60manufacture 

?60manufacturing 

?60manufacturing 

?60manufactured 

{molding  techniques 

material  forming 

composite  fabrication 

composites  fabricatop  ^  H  ^  Hion 

composites  fabrication 

end 


Searcher  2 

This  searcher  found  it  inconvenient  to 
download  the  search  strategies.  While  the 
investigator  attempted  to  record  each 
strategy  as  it  was  executed,  the  strategy  for 
this  query  appears  to  have  been  missed. 
Searcher B
@str@ 

work  measurement 
job  analysis 
workload  assessment 
%time  in  motion  stud 
time  studies 
and 

workload 

work  measurement 
workload  assessment 
process  improvement 
end 


Searcher  E 

@str@ 

?00%work  measurement 
%work  measurement 
end 


Searcher  2  —  The  first  search  below  had  over 
6100  hits  and  the  second  nearly  1200;  no 
downloading  was  attempted. 

@str@ 

$work  measurement 
%job  analysis 
%work  load 
%job  shop  sched 
%systems  engineering 
end 

@str@ 

$work measurement
%job  analysis 


Searcher  4 

@str@ 

%process  and  product  improvement 
*work

*work measurement

%  process  improvement 

process 

processes 

?60process 

?60processes 

and 

*measurement
*work measurement
%process  improvement 
?60improvement 
?60enhancement
end 

@srtab@ 

work  measurement 
work  enhancement 
process  improvement 
process  enhancement 
process  measurement 
work  measurements 
work  enhancements 
process  improvements 
process  enhancements 
process  measurements 
end 
Searcher B

@swu@ 

work  measurement 
workload  assessment 
job  analysis 
time  studies 
process  improvement 
end 

@srtab@ 

work  measurement 
process  improvement 
end 


Searcher  E 

@swu@ 

de=work  measurement 
%work  measurement 
end 


Searcher 2 - The first search below had over
4800  hits  and  the  second  over  1000;  no 
downloading  was  attempted. 

@str@ 

$work  measurement 
%job  analysis 
%work  load 
%job  shop  sched 
%systems  engineering 
end 

@str@ 

$work  measurement 
%job  analysis 


Searcher  4 

@swu@ 

%process  and  product  improvement 
*work

*work measurement

%process improvement

process 

processes 

ti= process 

ti= processes 

and 

*measurement
*work measurement
%  process  improvement 
ti= improvement 
ti=enhancement 
end 

@srtab@ 

work  measurement 
work  enhancement 
process  improvement 
process  enhancement 
process  measurement 
work  measurements 
work  enhancements 
process  improvements 
process  enhancements 
process  meaurements 
process  measurements 
end 
Searcher A

19  TR 

Searcher  1 

@str@ 

@str@ 

$radar receivers

%warning

and

and

$warning systems

%receiver

and

and

$rotary wing aircraft

%radar 

fixed  wing  aircraft 

{radar 

end 

and 

Searcher  B 

%helicopter 
%fixed  wing  aircraft 
end 

@str@ 

$radar

Searcher  5 

and 

$receivers

@str@ 

and 

radar  receiver 

{aircraft 

%radar  warning  receiver 

end 

%rwr% 

Searcher  C 

%rwr  (radar 
and 

%warning receiver

%radar  warning  rec 

%radar  warning  receiver 
%rwr% 

radar  receivers 

%rwr(radar 

and 

and 

$warning systems

{aircraft 

%radar  warning  rec 

end 

and 

$rotary wing aircraft

{helicopters
fixed wing aircraft

$jet aircraft
{military  aircraft 

tank  aircraft 
commercial  aircraft 
tanker  aircraft 
training  aircraft 
end 


Searcher  5 


@swuwps@
radar  receiver 
%radar  warning  receiver 
%rwr% 

%rwr(radar 

and 

%warning receiver
%radar  warning  receiver 
%rwr% 

%rwr(radar 

and 

{aircraft 

end 
Appendix 4
Sample  Search 

Query  1  executed  by  searcher  C  in  TR  Bibliographic  Database  was  selected  as  an  example. 

Query:  Flight  control  and  instrumentation:  including  displays  and  related  topics 

Search  strategy: 

@str@ 

$*  flight  control  systems 
%flight  display 
flight  instruments 
and 

instrumentation 
$flight  instruments 
end 

This  strategy  retrieved  319  hits,  of  which  70  were  included  in  the  sample  which  was  judged 
for  relevance.  Relevance  judgments  for  all  of  the  100  citations  in  the  sample  for  this  query 
were  as  follows: 

Judged relevant or partially relevant by both judges ........................ 62
Judged relevant or partially relevant by judge 1, not relevant by judge 2 ... 27
Judged relevant or partially relevant by judge 2, not relevant by judge 1 ...  1
Judged not relevant by both judges ..........................................  5
Relevance considered non-determinable by at least one judge .................  5

The 95 citations for which both judges provided judgments were used in the analysis. Of
these 95 sample citations, 62 (65 percent) were judged relevant using the criterion of
concurrence of judges, and 90 (95 percent) were judged relevant by either judge.
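The tallies above can be rechecked with a short script. This is only an illustrative sketch: the dictionary keys and the `summarize` helper are hypothetical names introduced here, and the counts are copied from the judgment list for this query.

```python
# Relevance judgment counts for the 100 sampled citations (from the list above).
judgments = {
    "both_relevant": 62,      # relevant or partially relevant by both judges
    "judge1_only": 27,        # relevant by judge 1, not relevant by judge 2
    "judge2_only": 1,         # relevant by judge 2, not relevant by judge 1
    "neither_relevant": 5,    # not relevant by both judges
    "non_determinable": 5,    # at least one judge could not decide
}


def summarize(counts):
    """Derive the sample sizes and relevance shares used in the analysis."""
    sample = sum(counts.values())
    # Only citations judged by both judges enter the analysis.
    judged = sample - counts["non_determinable"]
    concurrence = counts["both_relevant"]
    either = concurrence + counts["judge1_only"] + counts["judge2_only"]
    return {
        "sample": sample,
        "judged": judged,
        "concurrence": concurrence,
        "either": either,
        "concurrence_share": concurrence / judged,
        "either_share": either / judged,
    }


summary = summarize(judgments)
print(summary)
```

The two shares are the fractions of the judged citations counted as relevant under each criterion.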

Of  searcher  C’s  70  hits  that  were  included  in  the  sample,  42  were  judged  relevant  by 
concurrence  of  judges,  and  45  by  either  judge.  This  searcher’s  precision  and  recall  ratios 
were  calculated  as  follows: 

Concurrence  of  judges: 

Precision = 42 relevant retrieved ÷ 70 total retrieved = .60
Recall = 42 relevant retrieved ÷ 62 relevant in sample = .68

Either judge:

Precision = 45 relevant retrieved ÷ 70 total retrieved = .64
Recall = 45 relevant retrieved ÷ 90 relevant in sample = .50
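
As a check on the arithmetic, the same ratios can be computed in a few lines. This is a minimal sketch rather than the procedure used in the study: the function names and the `criteria` dictionary are assumptions made for illustration, and the figures are those given for searcher C.

```python
def precision(relevant_retrieved, total_retrieved):
    """Share of the searcher's sampled hits that were judged relevant."""
    if total_retrieved == 0:
        raise ValueError("total_retrieved must be greater than zero")
    return relevant_retrieved / total_retrieved


def recall(relevant_retrieved, relevant_in_sample):
    """Share of the relevant sample citations that the searcher retrieved."""
    if relevant_in_sample == 0:
        raise ValueError("relevant_in_sample must be greater than zero")
    return relevant_retrieved / relevant_in_sample


# Searcher C, Query 1, TR database: 70 hits fell in the judged sample.
total_retrieved = 70

# criterion -> (relevant retrieved by searcher C, relevant citations in sample)
criteria = {
    "concurrence of judges": (42, 62),
    "either judge": (45, 90),
}

results = {}
for name, (relevant_retrieved, relevant_in_sample) in criteria.items():
    p = round(precision(relevant_retrieved, total_retrieved), 2)
    r = round(recall(relevant_retrieved, relevant_in_sample), 2)
    results[name] = (p, r)
    print(f"{name}: precision = {p:.2f}, recall = {r:.2f}")
```

The stricter criterion (concurrence of judges) shrinks the pool of relevant citations, so the same searcher shows lower precision but higher recall under it than under the either-judge criterion.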


Appendix  5 


Concurrence  between 
Appendix 6


Precision  and  Recall 
by  Query  and  Searcher 
Appendix 7
Hits  in  Sample 